Preface



This book summarizes ongoing research introducing probability space isomorphic mappings
into the strategy spaces of game theory.

This approach is motivated by discrepancies between probability theory and game
theory when applied to the same strategic situation. In particular, probability theory and
game theory can disagree on calculated values of the Fisher information, the log likelihood
function, entropy gradients, the rank and Jacobian of variable transforms, and even
the dimensionality and volume of the underlying probability parameter spaces. These
differences arise as probability theory employs structure preserving isomorphic mappings
when constructing strategy spaces to analyze games. In contrast, game theory uses weaker
mappings which change some of the properties of the underlying probability distributions
within the mixed strategy space. Here, we explore how using strong isomorphic mappings
to define game strategy spaces can alter rational outcomes in simple games.

Specific example games considered are the chain store paradox, the trust game, the 
ultimatum game, the public goods game, the centipede game, and the iterated prisoner's 
dilemma. In general, our approach provides rational outcomes which are consistent with 
observed human play and might thereby resolve some of the paradoxes of game theory. 

0.1 Acknowledgments 

The author gratefully acknowledges a fruitful collaboration with Kae Nemoto. 
Chapter 1

Strong isomorphisms in strategy 
spaces 

1.1 Introduction 

1.1.1 Irreducible complexity of strategic optimization 

The essential problem of economics and the rationale for game theory was first posed
by von Neumann and Morgenstern [1]. They described the fundamental economic opti- 
mization problem by contrasting the non-strategic single player case with the strategic 
multi-player situation. In particular, they stated the non-strategic case is "an economy 
which is represented by the 'Robinson Crusoe' model, that is an economy of an isolated 
single person, or otherwise organized under a single will." In this economy, "Crusoe faces 
an ordinary maximization problem, the difficulties of which are of a purely technical — and 
not conceptual — nature" . This non-strategic case was contrasted with a strategic "social 
exchange economy [where] the result for each one will depend in general not merely upon 
his own actions but on those of the others as well. . . . This kind of problem is nowhere 
dealt with in classical mathematics. . . . this is no ordinary maximization problem, no 
problem of the calculus of variations, of functional analysis, etc" [1].

Thus, von Neumann and Morgenstern essentially claimed that strategic optimization 
problems were irreducibly more complex than non-strategic optimization problems. And 
yet, after learning a few new techniques, the solution of strategic games turns out to be 
not significantly more complex than the solution of non-strategic decision trees — larger 
and more difficult certainly, but not irreducibly more complex. In this work, we claim 
that the proposed solution to strategic analysis is incomplete. We will argue that strategic 
optimization is indeed irreducibly more complex than non-strategic optimization, and this 
irreducible complexity is missing from current formulations of strategic optimization. 

We will look for this missing irreducible complexity by applying probability theory 
and game theory to the same strategic situation, and examining any differences that 
arise. We will show that when applied to the same strategic game, probability theory
and game theory can disagree on calculated values of the Fisher information, the log 
likelihood and entropy gradients, the rank and Jacobian of variable transforms, and even 
the dimensionality and volume of the underlying probability parameter spaces. These 
differences arise as probability theory employs structure preserving, isomorphic mappings 
when constructing a mixed strategy space to analyze games. In contrast, game theory 
uses weaker mappings which change some of the properties of the underlying probability 
distributions within the mixed strategy space. We will explore how using strong iso- 
morphic mappings to define mixed strategy spaces can alter rational outcomes in simple 
games, and might resolve some of the paradoxes of game theory. 

1.1.2 Strategy spaces of game theory 

One possibly fruitful way to gain insight into the paradoxes of game theory is to show 
that probability theory and game theory analyze simple games differently. It would be 
expected of course that these two well developed fields should always produce consistent 
results. However, we will show in this paper that probability theory and game theory 
can produce contradictory results when applied to even simple games. These differences 
arise as these two fields construct mixed strategy spaces differently. 

The mixed strategy space of game theory is constructed, according to von Neumann 
and Morgenstern, by first making a listing of every possible combination of moves that 
players might make and of all possible information states that players might possess. This 
complete embodiment of information then allows every move combination to be mapped 
into a probability simplex whereby each player's mixed strategy probability parameters 
belong to "disjoint but exhaustive alternatives, . . . subject to the [usual normalization] 
conditions . . . and to no others" [1]. The resulting unconstrained mixed strategy space
is then a "complete set" of all possible probability distributions that might describe the
moves of a game [1, 2, 3, 4, 5]. Further, the absence of non-normalization constraints
ensures "trembles" or "fluctuations" are always present within the mixed strategy space
so every possible pure strategy probability distribution is played with non-zero (but
possibly infinitesimal) probability [6]. Together, these properties of the mixed strategy
space — a complete set of "contained" probability distributions, no additional constraints, 
and ever present trembles — lead to inconsistencies with probability theory. 

1.1.3 Isomorphic probability spaces

In constructing a mixed strategy space, probability theory first examines how subsidiary
probability distributions can be "contained" within a mixed space and whether the properties
of the probability distributions are altered as a result. Probability theory uses
isomorphisms to implement mappings of one probability space into another space. An
isomorphism is a structure preserving mapping from one space to another space. In
abstract algebra for instance, an isomorphism between vector spaces is a bijective (one-to-one
and onto) linear mapping between the spaces with the implication that two vector
spaces are isomorphic if and only if their dimensionality is identical [7]. When the preservation
of structure is exact, then calculations within either space must give identical
results. Conversely, if the degree of structure preservation is less than exact, then differences
can arise between calculations performed in each space. It is thus crucial to
examine the fidelity of the "containment" mappings used to construct the mixed spaces
of game theory. Probability theory defines isomorphic probability spaces as follows. We
give two definitions for completeness; see Refs. [8, 9, 10].

Definition 1: A probability space $\mathcal{P} = \{\Omega, \sigma, P\}$ consists of a set of events $\Omega$, a
sigma-algebra of all subsets of those events $\sigma$, and a probability measure defined over
the events $P$. Two probability spaces $\mathcal{P} = \{\Omega, \sigma, P\}$ and $\mathcal{P}' = \{\Omega', \sigma', P'\}$ are said to be
strictly isomorphic if there is a bijective (one-to-one and onto) map $f : \Omega \to \Omega'$ which exactly
preserves assigned probabilities, so for all $e \in \Omega$ we have $P(e) = P'[f(e)]$. A slight
weakening of this definition defines an isomorphism as a bijective mapping $f$ of some
unit probability subset of $\Omega$ onto a unit probability subset of $\Omega'$. That is, the weakened
mapping ignores null event subsets of zero probability.

Definition 2: Two probability spaces $\mathcal{P} = \{\Omega, \sigma, P\}$ and $\mathcal{P}' = \{\Omega', \sigma', P'\}$ are
isomorphic if there are null event sets $\Omega^0 \subset \Omega$ and $\Omega'^0 \subset \Omega'$ and an isomorphism
$f : (\Omega - \Omega^0) \to (\Omega' - \Omega'^0)$ between the two measurable spaces $(\Omega - \Omega^0, \sigma)$ and $(\Omega' - \Omega'^0, \sigma')$
with the added properties that $P'(F) = P[f^{-1}(F)]$ for $F \in \sigma'$ and $P(G) = P'[f(G)]$ for
$G \in \sigma$. In other words, an isomorphism exists if there is an invertible measure-preserving
transformation between the unit probability events in each space, $(\Omega - \Omega^0) \subset \Omega$ and
$(\Omega' - \Omega'^0) \subset \Omega'$. This also implies that the null probability event sets of each space are
mapped to each other.

In particular, we note that strong isomorphisms between source and target probability
spaces require they have identical dimensionality and tangent spaces [11].

1.1.4 Isomorphism choice alters optimization outcomes 

The mixed strategy space of game theory "contains" different probability distributions,
many possessing different dimensionality (according to probability theory). Their altered
dimensionality within the mixed space can alter those computed outcomes dependent on 
dimensionality. A simple illustration of this process can make this clear. 

A 1-dimensional function $f(x)$ can be embedded within a 2-dimensional function
$g(x, y)$ in two ways: using constraints $g(x, y_0) = f(x)$, or limits $\lim_{y \to y_0} g(x, y) = f(x)$.
In either case, many of the properties of the source function $f(x)$ are preserved, but not
necessarily all of them. In particular, these different methods alter gradient optimization
calculations. That is, the gradient is properly calculated when constraints are used,
$f'(x) = g'(x, y_0)$, but not when a limit process is used, $f'(x) \neq \lim_{y \to y_0} \nabla g(x, y)$ (where
$\nabla$ indicates a gradient operator).
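The distinction between resolving a constraint first and taking a limit afterwards can be checked numerically. The following is a minimal sketch only: the embedding g(x, y) = x^2 + x (y - y0) and the helper name partial are hypothetical choices made for illustration, not taken from the text.

```python
def f(x):
    """Source 1-D function (illustrative choice)."""
    return x ** 2

def g(x, y, y0=1.0):
    """Hypothetical 2-D embedding chosen so that g(x, y0) == f(x)."""
    return x ** 2 + x * (y - y0)

def partial(fn, point, index, h=1e-6):
    """Central finite difference of fn along coordinate `index`."""
    lo, hi = list(point), list(point)
    lo[index] -= h
    hi[index] += h
    return (fn(*hi) - fn(*lo)) / (2 * h)

x, y0 = 0.7, 1.0

# Constraint y = y0 resolved before differentiating: a 1-D derivative.
constrained = partial(lambda t: g(t, y0), (x,), 0)
source = partial(f, (x,), 0)

# Limit process: take the full 2-D gradient, then let y approach y0.
limit_gradient = [partial(g, (x, y0 + 1e-9), i) for i in range(2)]

print(constrained, source)   # the two 1-D derivatives agree
print(limit_gradient)        # a 2-vector whose y-component does not vanish
```

In this sketch the constrained derivative reproduces f'(x), while the limit of the full gradient is a two-component object that keeps a nonzero component in the y direction.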

We note our use of gradient operators is unusual in game theory. In lieu of gradient 
operators, the rational players of game theory generally simply compare the values of 
expected payoff functions at different points within a probability space. However, we 
remind ourselves that every comparison of an expected payoff function over a probability 
space is equivalent to evaluating a gradient. Specifically, a function $\Pi(x, y)$ with expectation
$\langle \Pi(a) \rangle$ compared at the points $a_1$ and $a_2$ within a probability space employs the
identity

$$
\langle \Pi(a_2) \rangle - \langle \Pi(a_1) \rangle = \nabla \langle \Pi(a) \rangle \cdot d_{21}, \tag{1.1}
$$

where the distance vector is $d_{21} = \hat{a}(a_2 - a_1)$. This results as all expectations are polylinear
in each probability parameter.
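For a single probability parameter the identity can be verified directly. The sketch below assumes a two-outcome space with hypothetical payoffs payoff_A and payoff_B (illustrative values, not from the text), so that the expected payoff is linear in a.

```python
def expected_payoff(a, payoff_A=3.0, payoff_B=1.0):
    """<Pi>(a) for outcomes A, B with probabilities a and 1 - a (hypothetical payoffs)."""
    return a * payoff_A + (1.0 - a) * payoff_B

def gradient(a, h=1e-6):
    """Central finite-difference derivative of the expected payoff."""
    return (expected_payoff(a + h) - expected_payoff(a - h)) / (2 * h)

a1, a2 = 0.2, 0.9
difference = expected_payoff(a2) - expected_payoff(a1)
via_gradient = gradient(a1) * (a2 - a1)

print(difference, via_gradient)  # equal up to rounding, since <Pi> is linear in a
```

The agreement here relies on linearity in the one parameter being varied; the sketch does not test the multi-parameter case.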

1.1.5 Mismatch between probability and game theory 

In this paper, we will show that exactly the same discrepancies arise when probability 
theory and game theory are applied to simple probability spaces, and that these discrep- 
ancies can be significant. It is useful to indicate the magnitude of these discrepancies 
here to motivate the paper (with full details given in later sections below). We consider
a simple card game with two potentially correlated variables $x, y \in \{0, 1\}$ with
joint probability distribution $P_{xy}$. In the case where $x$ and $y$ are perfectly correlated,
probability theory (denoted by $P$) and game theory (denoted by $G$) respectively assign
different dimensions to both the Fisher information matrix ($F$) and the gradient of the
log likelihood function ($\nabla L$), and can disagree on the value of the gradient of the joint
entropy at some points ($\nabla E_{xy}$):

$$
\begin{array}{lcc}
 & P & G \\
\dim(F) & 1 & 3 \\
\dim(\nabla L) & 1 & 3 \\
|\nabla E_{xy}| & 0 & \infty
\end{array}
\tag{1.2}
$$

These fields also disagree on the probability space gradients of both the normalization
condition ($P_{00} + P_{11} = 1$) and the requirement that the joint entropy equates to the
marginal entropy ($E_{xy} - E_x = 0$):

$$
\begin{array}{lcc}
 & P & G \\
\nabla(P_{00} + P_{11}) & = 0 & \neq 0 \\
\nabla(E_{xy} - E_x) & = 0 & \neq 0
\end{array}
\tag{1.3}
$$

Should these fields model a change of variable within this game, they further disagree 
on the rank of the transform matrix (A), and on the invertibility of the Jacobian matrix 
($J$):

$$
\begin{array}{lcc}
 & P & G \\
\mathrm{Rank}(A) & 1 & 2 \\
J & \text{Singular} & \text{Invertible}
\end{array}
\tag{1.4}
$$

These fields even disagree on the dimension ($d$) and volume ($V$) of the minimal probability
space used to analyze the game:

$$
\begin{array}{lcc}
 & P & G \\
d & 1 & 3 \\
V & 1 & \frac{1}{6}
\end{array}
\tag{1.5}
$$



The differences between game theory and probability theory arise due to the different 
use of isomorphic mappings to construct mixed strategy spaces. 

We now show the necessity for considering isomorphic probability spaces using exam- 
ples ranging from simple dice games to bivariate normal distributions. 

1.2 Optimization and isomorphic probability spaces 

In this section, we introduce the need to use isomorphic mappings when embedding 
probability spaces within mixed spaces. 

1.2.1 Isomorphic dice 

Consider the three alternate dice shown in Fig. 1.1 representing a 2-sided coin, a 3-sided
triangle, and a 4-sided square. Faces are labeled with capital letters and the probabilities
of each face appearing are labeled with the corresponding small letter. The corresponding
probability spaces defined by these dice are

$$
\begin{aligned}
\mathcal{P}_{\rm coin} &= \{x \in \{A, B\}, \{a, b\}\} \\
\mathcal{P}_{\rm triangle} &= \{x \in \{A, B, C\}, \{a, b, c\}\} \\
\mathcal{P}_{\rm square} &= \{x \in \{A, B, C, D\}, \{a, b, c, d\}\}.
\end{aligned}
\tag{1.6}
$$

Here the required sigma-algebras are not listed, and each of these spaces is subject
to the usual normalization conditions. For notational convenience we sometimes write
$(p_1, p_2, p_3, p_4) = (a, b, c, d)$ and denote the number of sides of each respective die as
$n \in \{2, 3, 4\}$. In each respective die space, the gradient operator is

$$
\nabla = \sum_{i=1}^{n-1} \hat{p}_i \frac{\partial}{\partial p_i}, \tag{1.7}
$$

where a hatted variable $\hat{p}_i$ is a unit vector in the indicated direction and we resolve the
normalization constraint via $p_n = 1 - \sum_{i=1}^{n-1} p_i$.
Figure 1.1: Three alternate dice with different numbers of sides. A coin with sides $A$
and $B$ appearing with respective probabilities $a$ and $b$, a triangle with faces $A$, $B$ and $C$
occurring with respective probabilities $a$, $b$ and $c$, and a square die with faces $A$, $B$, $C$ and
$D$ each occurring with respective probabilities $a$, $b$, $c$ and $d$.

We now wish to optimize a nonlinear function over these spaces, and we choose a 
function which cannot be optimized using standard approaches in game theory. The 
chosen function is 

$$
f = V^2 E_x, \tag{1.8}
$$

with

$$
V = \int_{\rm space} dv, \qquad E_x = -\sum_{i=1}^{n} p_i \log p_i, \tag{1.9}
$$

where $V$ is the volume of each respective probability parameter space and $E_x$ is the
marginal entropy of each space [12]. We will complete this optimization in three different
ways, two of which will be consistent with each other and inconsistent with the third.

As a first pass at optimizing the function /, we simply maximize / within each prob- 
ability space and then compare the optimal outcomes to determine the best achievable 
outcome. As is well understood, the entropy of a set of n events is maximized when those 
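events are equiprobable giving a maximum entropy of $E_{x,\max} = \log n$. In addition, for
the coin we have

$$
\begin{aligned}
V &= \int_0^1 da \int_0^1 db \, \delta_{a+b=1} = \int_0^1 da = 1 \\
E_x &= -[a \log a + (1 - a) \log(1 - a)] \\
\nabla E_x &= -\hat{a} \log \frac{a}{1 - a}.
\end{aligned}
\tag{1.10}
$$
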
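For the triangle, the equivalent functions are

$$
\begin{aligned}
V &= \int_0^1 da \int_0^1 db \int_0^1 dc \, \delta_{a+b+c=1} = \int_0^1 da \int_0^{1-a} db = \frac{1}{2} \\
E_x &= -[a \log a + b \log b + (1 - a - b) \log(1 - a - b)] \\
\nabla E_x &= -\hat{a} \log \frac{a}{1 - a - b} - \hat{b} \log \frac{b}{1 - a - b}.
\end{aligned}
\tag{1.11}
$$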

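Finally, for the square, we have

$$
\begin{aligned}
V &= \int_0^1 da \int_0^1 db \int_0^1 dc \int_0^1 dd \, \delta_{a+b+c+d=1} = \int_0^1 da \int_0^{1-a} db \int_0^{1-a-b} dc = \frac{1}{6} \\
E_x &= -[a \log a + b \log b + c \log c + (1 - a - b - c) \log(1 - a - b - c)] \\
\nabla E_x &= -\hat{a} \log \frac{a}{1 - a - b - c} - \hat{b} \log \frac{b}{1 - a - b - c} - \hat{c} \log \frac{c}{1 - a - b - c}.
\end{aligned}
\tag{1.12}
$$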

Consequently, the function $f$ takes maximum values in the three probability spaces of

$$
f_{\rm coin, max} = \log 2, \qquad
f_{\rm triangle, max} = \frac{\log 3}{4}, \qquad
f_{\rm square, max} = \frac{\log 4}{36}.
\tag{1.13}
$$

Comparing these outcomes makes it clear that the best that can be achieved is to use a
coin with equiprobable faces.
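The comparison of the three maxima can be reproduced in a few lines. This is a minimal sketch, assuming the optimized function is f = V^2 E_x (the form implied by the square's quoted maximum of log(4)/36) and that the parameter-space volume of an n-sided die is 1/(n-1)!; the helper names simplex_volume and f_max are illustrative.

```python
import math

def simplex_volume(n):
    """Volume of the (n-1)-dimensional parameter space of an n-sided die,
    with the last probability eliminated by normalization: 1/(n-1)!."""
    return 1.0 / math.factorial(n - 1)

def f_max(n):
    """Maximum of f = V**2 * E_x, reached at equiprobable faces where E_x = log n."""
    return simplex_volume(n) ** 2 * math.log(n)

results = {name: f_max(n) for name, n in [("coin", 2), ("triangle", 3), ("square", 4)]}
best = max(results, key=results.get)

for name, value in results.items():
    print(f"{name:8s} f_max = {value:.4f}")
print("best space:", best)
```

Under these assumptions the coin gives the largest value, because the squared volume factor shrinks faster than the entropy grows as faces are added.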

The second method uses isomorphisms to map all of the three incommensurate source 
spaces into a single target space. We choose our mappings as follows: 

$$
\begin{aligned}
\mathcal{P}_{\rm coin} &= \{x \in \{A, B, C, D\}, \{a, b, c, d\}\}\big|_{(c,d)=(0,0)} \\
\mathcal{P}_{\rm triangle} &= \{x \in \{A, B, C, D\}, \{a, b, c, d\}\}\big|_{d=0} \\
\mathcal{P}_{\rm square} &= \{x \in \{A, B, C, D\}, \{a, b, c, d\}\}.
\end{aligned}
\tag{1.14}
$$

Here, while all probability spaces share a common event set and probability distribution,
the isomorphic mappings impose constraints on the $\mathcal{P}_{\rm coin}$ and $\mathcal{P}_{\rm triangle}$ spaces. The
constraints arise from mapping the null sets of zero probability from each source space
to the corresponding events of the enlarged target space. The target probability space
is shown in Fig. 1.2 where the normalization condition $d = 1 - a - b - c$ is used. The
points corresponding to the probability spaces of the coin $\mathcal{P}_{\rm coin}$ are mapped along the line
$a + b = 1$ with constraint $(c, d) = (0, 0)$. Those points corresponding to the probability
spaces of the triangle $\mathcal{P}_{\rm triangle}$ are mapped along the surface $a + b + c = 1$ with constraint
$d = 0$. Finally, the probability spaces corresponding to the square $\mathcal{P}_{\rm square}$ fill the volume
$a + b + c + d = 1$ and are not subject to any other constraint.

The interesting point about the target space is that many points, e.g. $(a, b, c, d) =
(\frac{1}{2}, \frac{1}{2}, 0, 0)$, lie in all of the probability spaces of the coin, triangle, and square die and are
only distinguished by which constraints are acting. That is, when this point is subject to
the constraint $(c, d) = (0, 0)$, then it corresponds to the probability space $\mathcal{P}_{\rm coin}$ (and not to
any other). Conversely, when this same point is subject to an imposed constraint $d = 0$
then it corresponds to the probability space $\mathcal{P}_{\rm triangle}$. Finally, when no constraints apply
then, and only then, does this point correspond to the probability space of the square
$\mathcal{P}_{\rm square}$. This means that it is not the probability values possessed by a point which
determines its corresponding probability space but the probability values in combination
with the constraints acting at that point.

It is now straightforward to use the isomorphically constrained target space to max- 
imize the function / over all embedded probability spaces using standard constrained 
optimization techniques. For instance, to optimize / over points corresponding to the 
coin and subject to the constraint $(c, d) = (0, 0)$ then either simply resolve the constraint
via setting $c = d = 0$ before the optimization begins, or simply evaluate the gradient
of $f$ at all points $(a, b, 0, 0)$ in the direction of the unit vector $\frac{1}{\sqrt{2}}(1, -1, 0, 0)$ lying along
the line $a + b = 1$. In more detail, the function $f(a, b, c)$ has a directed gradient in the
direction $\frac{1}{\sqrt{2}}(1, -1, 0)$ of

$$
\nabla f(a, b, c) \cdot \frac{1}{\sqrt{2}}(1, -1, 0) = \frac{V^2}{\sqrt{2}} \log \frac{b}{a} \tag{1.15}
$$
Figure 1.2: The target space containing points corresponding to the probability spaces
respectively of the coin $\mathcal{P}_{\rm coin}$ along the line $a + b = 1$ with constraint $(c, d) = (0, 0)$ (heavy
line), of the triangle $\mathcal{P}_{\rm triangle}$ along the surface $a + b + c = 1$ with constraint $d = 0$ (hashed
surface), and of the square $\mathcal{P}_{\rm square}$ filling the volume $a + b + c + d = 1$ (filled polygon).
Note that points such as $(a, b, c) = (0.5, 0.5, 0)$ correspond to all three probability spaces
and are only distinguished by which constraints are acting.



using Eq. 1.12. The rate of change of $f$ with respect to the only remaining variable $a$ is
given by

$$
\frac{df}{da} = \sqrt{2} \, \nabla f \cdot \frac{1}{\sqrt{2}}(1, -1, 0). \tag{1.16}
$$

Altogether, at points where $(a, b, c) = (a, 1 - a, 0)$ this gives a directed gradient of

$$
\frac{df}{da} = -V^2 \log \frac{a}{1 - a}, \tag{1.17}
$$

which is optimized at $(a, b, c) = (\frac{1}{2}, \frac{1}{2}, 0)$. An optimization over all three isomorphic
constraints leads to the same outcomes as obtained previously in Eq. 1.13.
This completes the second optimization analysis and as promised, it is consistent
with the results of the first.
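The directed gradient along the coin line can be checked numerically inside the four-parameter target space. The sketch below is illustrative only: it takes V = 1 for the coin so that f reduces to the entropy, uses the convention 0 log 0 = 0, and the helper names are hypothetical.

```python
import math

def entropy(probs):
    """Shannon entropy with the convention 0 * log 0 = 0."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)

def coin_line_derivative(a, h=1e-6):
    """Central difference of the entropy along (1, -1, 0, 0): probability moves
    from face B to face A while the constraint c = d = 0 stays in force."""
    forward = entropy([a + h, 1.0 - a - h, 0.0, 0.0])
    backward = entropy([a - h, 1.0 - a + h, 0.0, 0.0])
    return (forward - backward) / (2 * h)

slope_at_03 = coin_line_derivative(0.3)
slope_at_05 = coin_line_derivative(0.5)

print(slope_at_03, math.log(0.7 / 0.3))  # matches log((1 - a) / a)
print(slope_at_05)                       # vanishes at the equiprobable coin
```

Because the constraint is held fixed while differentiating, the derivative stays finite even though the point lies on the boundary of the square's parameter space.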

The same is not true of the third optimization approach which produces results in- 
consistent with the first two. The reason we present this method is that it is in common 
use in game theory. The third optimization method commences by noting that the probability
space of the square is complete in that it already "contains" all of the probability
spaces of the triangle and of the coin. This allows a square probability space to mimic
a coin probability space by simply taking the limit $(c, d) \to (0, 0)$. Similarly, the square
mimics the triangle through the limit $d \to 0$. In turn, this means that an optimization
over the space of the square is effectively an optimization over every choice of space
within the square. Specifically, game theory discards constraints to model the choice
between contained probability spaces. This optimization over the points of the square
has already been completed above. When optimizing the function $f$ over the unconstrained
points corresponding to the square, the maximum value is $f = \log(4)/36$ at
$(a, b, c, d) = (\frac{1}{4}, \frac{1}{4}, \frac{1}{4}, \frac{1}{4})$, and according to game theory, this is the best outcome when
players have a choice between the coin, the triangle, or the square.

The optimum result obtained by the third optimization method, that used by game 
theory, conflicts with those found by the previous two methods as commonly used in 
probability theory. The difference arises as game theory models a choice between proba- 
bility spaces by making players uncertain about the values of their probability parameters 
within any probability space. Consequently, their probability parameters are always subject
to infinitesimal fluctuations, i.e. $c > 0^{+}$ or $d > 0^{+}$ always. These fluctuations alter
the dimensions of the space which impacts on the calculation of the volume V and alters 
the calculated gradient of the entropy. Game theory eschews the role of isomorphism con- 
straints within probability spaces on the grounds that any such constraints restrict player 
uncertainty and hence their ability to choose between different probability spaces. The 
probability parameter fluctuations mean that players have access to all possible proba- 
bility dimensions at all times so a single mixed space is the appropriate way to model the 
choice between contained probability spaces. In contrast, probability theory holds that 
the choice between probability spaces introduces player uncertainty about which space to 
use, but specifically does not introduce uncertainty into the parameters within any indi- 
vidual probability space. As a result, probability theory employs isomorphic constraints 
to ensure that the properties of each embedded probability space within the mixed space 
are unchanged. 

The upshot is that a game theorist cannot evaluate the entropy (or uncertainty)
gradient of a coin toss while considering alternate dice because uncertainty about which
die is used bleeds into the entropy calculation. However, the probability theorist will
distinguish between their uncertainty about which face of the coin will appear and their
uncertainty about which die is being used.
1.2.2 Alternate coin probability spaces

The preceding section has shown the importance of using isomorphism constraints to 
preserve the properties of the coin probability space $\mathcal{P}_{\rm coin}$ when embedded within larger
spaces. However, isomorphism constraints must also be used in the very definition of 
a probability space. If a probability space is to be defined to match some physical 
apparatus, then a structure preserving isomorphic mapping must be established between 
the physical apparatus and the probability space. We illustrate this now by adopting 
several different probability spaces for a coin. 

In the preceding sections, we have the physical coin as shown in Fig. 1.1 and its
corresponding probability space as defined in Eq. 1.6. To reiterate,

$$
\mathcal{P}_{\rm coin} = \{x \in \{A, B\}, \{a, b\}\}. \tag{1.18}
$$

After taking account of the normalization constraint $b = 1 - a$, the gradient operator in
this space is

$$
\nabla = \hat{a} \frac{\partial}{\partial a}. \tag{1.19}
$$

If we define a payoff via the random variable $\Pi(A) = 0$ and $\Pi(B) = 1$, then a gradient
optimization gives

$$
\nabla \langle \Pi \rangle = \nabla P(B) = -\hat{a} \tag{1.20}
$$

indicating that expected payoffs are maximized by setting $a = 0$ as expected.

There are many very different formulations possible for the probability space of a 
simple two sided coin, and these are considered to be functionally identical only after the 
appropriate structure-preserving isomorphisms have been defined. Every alternative in- 
troduces a different parameterization which alters dimensionality and gradient operators 
and modifies the optimization algorithm. We illustrate this now. 

Our coin could be optimized using a probability measure space $\mathcal{P}'_{\rm coin}$ involving two
uncorrelated coins, namely

$$
\mathcal{P}'_{\rm coin} = \{(x, y) \in \{(0,0), (0,1), (1,0), (1,1)\}, \{(1-p)(1-q), (1-p)q, p(1-q), pq\}\}. \tag{1.21}
$$

An isomorphism can be defined by mapping event $A$ onto the event set $(x, y) \in
\{(0,0), (1,1)\}$ and $B$ onto $(x, y) \in \{(0,1), (1,0)\}$. In this space, the gradient operator is

$$
\nabla = \hat{p} \frac{\partial}{\partial p} + \hat{q} \frac{\partial}{\partial q}, \tag{1.22}
$$

and a gradient optimization of the expected payoff gives

$$
\nabla \langle \Pi \rangle = \nabla P(B) = \hat{p}(1 - 2q) + \hat{q}(1 - 2p). \tag{1.23}
$$

This shows that when $q < \frac{1}{2}$ then payoffs are maximized by setting $p = 1$ and conversely,
when $p < \frac{1}{2}$ then payoffs are maximized by setting $q = 1$.
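The two-coin parameterization can be checked with a short sketch. It assumes, as in the mapping described for this space, that B is the event in which the two coins differ and that the payoff is 1 on B and 0 on A; the function names are illustrative.

```python
def prob_B(p, q):
    """P(B) when B is the event that the two coins differ:
    (x, y) in {(0, 1), (1, 0)} with probabilities (1-p)q and p(1-q)."""
    return (1.0 - p) * q + p * (1.0 - q)

def payoff_gradient(p, q):
    """Analytic gradient of <Pi> = P(B) with respect to (p, q)."""
    return (1.0 - 2.0 * q, 1.0 - 2.0 * p)

def numeric_gradient(p, q, h=1e-6):
    """Central finite-difference gradient, used only as a cross-check."""
    dp = (prob_B(p + h, q) - prob_B(p - h, q)) / (2 * h)
    dq = (prob_B(p, q + h) - prob_B(p, q - h)) / (2 * h)
    return (dp, dq)

print(payoff_gradient(0.8, 0.2), numeric_gradient(0.8, 0.2))
print(prob_B(1.0, 0.0))  # q < 1/2 pushes p to 1, giving P(B) = 1 when q = 0
```

The same binary decision now has a two-component gradient, which is the change in dimensionality that the reparameterization introduces.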

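Alternatively, the binary decision could be optimized using a continuously parameterized
probability measure space $\mathcal{P}''_{\rm coin}$. In this space, the choices $A$ and $B$ might be
determined using a continuously distributed variable $u \in (-\infty, \infty)$ possessing a normally
distributed probability distribution

$$
P(u) = \frac{1}{\sqrt{2\pi}\,\sigma} \, e^{-\frac{1}{2} \frac{(u - \bar{u})^2}{\sigma^2}} \tag{1.24}
$$

with mean $\bar{u}$, standard deviation $\sigma$, and variance $\sigma^2$. We introduce a new parameter, $p$,
so outcome $A$ occurs with probability

$$
P(A) = \frac{1}{\sqrt{2\pi}\,\sigma} \int_{-\infty}^{p} du \, e^{-\frac{1}{2} \frac{(u - \bar{u})^2}{\sigma^2}} \tag{1.25}
$$

while outcome $B$ occurs with probability

$$
P(B) = \frac{1}{\sqrt{2\pi}\,\sigma} \int_{p}^{\infty} du \, e^{-\frac{1}{2} \frac{(u - \bar{u})^2}{\sigma^2}}. \tag{1.26}
$$

This space has only one probability parameter $p$ so the gradient operator is

$$
\nabla = \hat{p} \frac{\partial}{\partial p}, \tag{1.27}
$$

and optimizing the expected payoff gives

$$
\nabla \langle \Pi \rangle = \nabla \frac{1}{\sqrt{2\pi}\,\sigma} \int_{p}^{\infty} du \, e^{-\frac{1}{2} \frac{(u - \bar{u})^2}{\sigma^2}} = -\nabla F(p), \tag{1.28}
$$

where $F(p)$ is the cumulative normal distribution. As the cumulative normal distribution
is monotonically increasing, $\nabla F(p) > 0$, so the expected payoff is maximized by setting
$p \to -\infty$ giving $P(B) = 1$ as expected.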
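The threshold construction can be sketched with the standard library's error function. The defaults mean=0.0 and sigma=1.0 and the function names are assumptions made for illustration; the payoff is taken to be 1 on B and 0 on A.

```python
import math

def normal_cdf(p, mean=0.0, sigma=1.0):
    """Cumulative normal distribution F(p), written with math.erf."""
    return 0.5 * (1.0 + math.erf((p - mean) / (sigma * math.sqrt(2.0))))

def prob_B(p, mean=0.0, sigma=1.0):
    """P(B) = probability that the normal variable u exceeds the threshold p."""
    return 1.0 - normal_cdf(p, mean, sigma)

def payoff_slope(p, mean=0.0, sigma=1.0):
    """d<Pi>/dp = -dF/dp, i.e. minus the normal density at p."""
    z = (p - mean) / sigma
    return -math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

for p in (2.0, 0.0, -2.0, -10.0):
    print(p, prob_B(p), payoff_slope(p))
```

The slope is negative at every threshold, so lowering p always raises the expected payoff, and P(B) approaches 1 as p becomes very negative.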

For a more extreme alternative, consider a quantum probability measure space $\mathcal{P}'''_{\rm coin}$
in which event $A$ corresponds to a measurement finding a two-state quantum system
in its ground state, and event $B$ occurs when the measurement finds the system in its
excited state. Writing the quantum system state as

$$
|\psi\rangle = \begin{pmatrix} a \\ b \end{pmatrix}, \tag{1.29}
$$

where $a$ and $b$ are complex numbers satisfying $|a|^2 + |b|^2 = 1$, then we have $P(A) = |a|^2$
and $P(B) = |b|^2$. In this space, the payoff is an operator

$$
\Pi = \begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix}, \tag{1.30}
$$

giving the expected payoff as

$$
\langle \Pi \rangle = \langle \psi | \Pi | \psi \rangle = |b|^2 = r^2, \tag{1.31}
$$

where in the last line we write $b = r e^{i\theta}$ with real $0 \le r \le 1$ and $0 \le \theta < 2\pi$. Here,
the expected payoff depends only on the single real variable $r$ so optimization is via the
gradient operator

$$
\nabla = \hat{r} \frac{\partial}{\partial r} \tag{1.32}
$$

giving

$$
\nabla \langle \Pi \rangle = 2r \, \hat{r}. \tag{1.33}
$$

As required, maximization requires setting $r = 1$, with $\theta$ arbitrary.

For a last example, consider a probability space $\mathcal{P}''''_{\rm coin}$ which selects a number $u$ in the
Cantor set $C$ with uniform probability $P(u)$ such that when $u < p$ then event $A$ occurs
while when $p < u$ then event $B$ occurs. The Cantor set $C$ is interesting as it has an
uncountably infinite number of members and yet has measure zero [13]. In this space,
the expected payoff is

$$
\langle \Pi \rangle = \sum_{u \in C} P(u) \Pi(u) = \sum_{u > p \in C} P(u) = 1 - C(p), \tag{1.34}
$$

where $C(p)$ is the cumulative probability distribution termed the Cantor function. Interestingly,
the Cantor function is an example of a "Devil's staircase", a function which is
continuous but not absolutely continuous, which is differentiable with derivative
zero almost everywhere, and which maps the measure zero Cantor set continuously
onto the measure one set $[0, 1]$ [13]. As with the normal distribution example above,
the Cantor function is nondecreasing allowing an intuitive maximization of the expected
payoff via the gradient operator

$$
\nabla = \hat{p} \frac{d}{dp} \tag{1.35}
$$

giving

$$
\nabla \langle \Pi \rangle = -\hat{p} \frac{dC}{dp}. \tag{1.36}
$$

As the Cantor function is nondecreasing, we have $\frac{dC}{dp} \ge 0$ so the expected
payoff is maximized by setting $p = 0$. This intuitive ansatz suffices for our purposes here.
Lastly, the player is of course, not restricted to using only simple probability mea- 
sure spaces, and more complicated spaces can be considered. In fact, players will most 
likely use a pseudo-random number generator consisting of the correlated dynamical interactions
of some millions (or more) of electronic components in a computer. It is only
the correlations of these millions of variables that allows a dimensionality reduction to 
the few variables required to model the player's chosen probability space. Isomorphisms 
underlie the dimensionality reductions of random number generators. 

To summarize, optimizing an expected payoff first requires the adoption of a suit- 
able probability measure space, and it is only the adoption of such a space that permits 
the definition of gradient operators and the expected payoff functions allowing the op- 
timization to be completed. These steps involve establishing an isomorphic mapping 
from the physically modeled space to the probability space which is property conserving. 
Of course, should the probability space then be embedded within any other probability 
space, these properties must still be conserved, and this will require additional isomorphic 
constraints. 




Figure 1.3: A four-sided square probability space where joint variables $x$ and $y$ take values
$(x, y) \in \{(0,0), (0,1), (1,0), (1,1)\}$ with respective probabilities $(a, b, c, d)$.



1.2.3 Joint probability space optimization 

We will briefly now examine isomorphisms between the joint probability spaces of two 
arbitrarily correlated random variables. In particular, we consider two random variables 
$x, y$ as appear on the square die of Fig. 1.3 with probability space

$$
\mathcal{P}_{\rm square} = \{(x, y) \in \{(0,0), (0,1), (1,0), (1,1)\}, \{a, b, c, d\}\}. \tag{1.37}
$$

The correlation between the $x$ and $y$ variables is

$$
\rho_{xy} = \frac{\langle xy \rangle - \langle x \rangle \langle y \rangle}{\sigma_x \sigma_y}
= \frac{ad - bc}{\sqrt{(c + d)(a + b)(b + d)(a + c)}}. \tag{1.38}
$$

Here, $\sigma_x$ and $\sigma_y$ are the respective standard deviations of the $x$ and $y$ variables.
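The correlation of the two binary variables can be computed directly from the four joint probabilities. This is a minimal sketch assuming the event ordering (0,0), (0,1), (1,0), (1,1) with probabilities (a, b, c, d); the function name is illustrative.

```python
import math

def correlation(a, b, c, d):
    """rho_xy for (x, y) in {(0,0), (0,1), (1,0), (1,1)} with probabilities (a, b, c, d)."""
    mean_x, mean_y = c + d, b + d          # P(x = 1) and P(y = 1)
    cov = d - mean_x * mean_y              # <xy> - <x><y>, since xy = 1 only at (1,1)
    var_x = mean_x * (1.0 - mean_x)
    var_y = mean_y * (1.0 - mean_y)
    return cov / math.sqrt(var_x * var_y)

print(correlation(0.5, 0.0, 0.0, 0.5))      # perfectly correlated: b = c = 0
print(correlation(0.42, 0.28, 0.18, 0.12))  # independent: ad = bc, so rho is ~0
print(correlation(0.1, 0.2, 0.3, 0.4))      # generic case, equals (ad - bc)/sqrt(...)
```

The two special cases correspond to the constraints b = c = 0 and ad = bc respectively.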

The space $\mathcal{P}_{\rm square}$ of course contains many embedded or contained spaces. We will
separately consider the case where $x$ and $y$ are perfectly correlated, and where they are
independent. As noted previously, there are two distinct ways for these spaces to be
contained within $\mathcal{P}_{\rm square}$, namely using isomorphism constraints or using limit processes.
These two ways give the respective definitions for the perfectly correlated case

$$
\begin{aligned}
\mathcal{P}_{\rm corr} &= \{(x, y) \in \{(0,0), (0,1), (1,0), (1,1)\}, \{a, b, c, d\}\}\big|_{b=c=0} \\
\mathcal{P}'_{\rm corr} &= \lim_{(b,c) \to (0,0)} \{(x, y) \in \{(0,0), (0,1), (1,0), (1,1)\}, \{a, b, c, d\}\}
\end{aligned}
\tag{1.39}
$$

and for the independent case

$$
\begin{aligned}
\mathcal{P}_{\rm ind} &= \{(x, y) \in \{(0,0), (0,1), (1,0), (1,1)\}, \{a, b, c, d\}\}\big|_{ad=bc} \\
\mathcal{P}'_{\rm ind} &= \lim_{ad \to bc} \{(x, y) \in \{(0,0), (0,1), (1,0), (1,1)\}, \{a, b, c, d\}\}.
\end{aligned}
\tag{1.40}
$$


Here, all spaces satisfy the normalization constraint a + b + c + d = 1, which we typically 
resolve using d = 1 — a — b — c. The gradient operator in the probability space of the 
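square die with probability parameters $(a, b, c)$ is

$$
\nabla = \hat{a} \frac{\partial}{\partial a} + \hat{b} \frac{\partial}{\partial b} + \hat{c} \frac{\partial}{\partial c}, \tag{1.41}
$$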

where a hat indicates a unit vector in the indicated direction. Evaluating any function 
dependent on a gradient or completing an optimization task using either isomorphic con- 
straints or limit processes can naturally result in different outcomes as we now illustrate. 

Perfectly correlated probability spaces 

We first consider the case where the $x$ and $y$ variables are perfectly correlated in the
spaces $\mathcal{P}_{\text{corr}}$ with isomorphism constraints or $\mathcal{P}'_{\text{corr}}$ using limit processes.

The maximum achievable joint entropy [12] for our two perfectly correlated variables
obviously occurs at the point where they are equiprobable. This can be found by
evaluating the gradient of the joint entropy function

\[
\begin{aligned}
E_{xy}(a, b, c) &= -\sum_{xy} P_{xy} \log P_{xy} \\
&= -a \log a - b \log b - c \log c - (1 - a - b - c) \log(1 - a - b - c)
\end{aligned}
\tag{1.42}
\]

giving respective gradients in the $\mathcal{P}_{\text{corr}}$ and $\mathcal{P}'_{\text{corr}}$ spaces of

\[
\begin{aligned}
\left. \nabla E_{xy} \right|_{b=c=0} &= -\hat{a} \log \frac{a}{1 - a} \\
\lim_{(b,c) \to (0,0)} \nabla E_{xy} &= \lim_{(b,c) \to (0,0)} \left[ -\hat{a} \log \frac{a}{1 - a - b - c}
 - \hat{b} \log \frac{b}{1 - a - b - c} - \hat{c} \log \frac{c}{1 - a - b - c} \right] \\
&= \text{undefined}.
\end{aligned}
\tag{1.43}
\]

Equating these gradients to zero locates the maximum at $(a, b, c) = (\tfrac{1}{2}, 0, 0)$ in
$\mathcal{P}_{\text{corr}}$ and at $(a, b, c) = (\tfrac{1}{4}, \tfrac{1}{4}, \tfrac{1}{4})$ in $\mathcal{P}'_{\text{corr}}$.
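
The contrast between the two gradients can be checked numerically. The sketch below is a
minimal illustration, assuming natural logarithms and the convention $0 \log 0 = 0$; the
helper names are hypothetical and do not come from the text.

```python
import math

def joint_entropy(a, b, c):
    """Joint entropy of Eq. (1.42) in nats, using the convention 0 log 0 = 0."""
    probs = (a, b, c, 1.0 - a - b - c)
    return -sum(p * math.log(p) for p in probs if p > 0.0)

def constrained_slope(a, h=1e-6):
    """dE/da with b = c = 0 imposed before differentiating (central difference)."""
    return (joint_entropy(a + h, 0.0, 0.0) - joint_entropy(a - h, 0.0, 0.0)) / (2 * h)

def unconstrained_b_slope(a, b, c=0.0):
    """Analytic dE/db = -log(b / (1 - a - b - c)) when b is a free parameter."""
    return -math.log(b / (1.0 - a - b - c))

print(constrained_slope(0.5))             # vanishes at the constrained maximum
print(unconstrained_b_slope(0.5, 1e-3))   # grows without bound as b -> 0
print(unconstrained_b_slope(0.5, 1e-9))
print(joint_entropy(0.25, 0.25, 0.25))    # unconstrained maximum, 2 log 2
```

The constrained slope vanishes at $a = 1/2$, whereas the $\hat{b}$ component in the full
space diverges logarithmically as $b \to 0$, which is why the limit in Eq. (1.43) is undefined.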

The Fisher Information is defined in terms of probability space gradients as the
amount of information obtained about a probability parameter from observing any event
[12]. Writing $(a, b, c) = (p_1, p_2, p_3)$, the Fisher Information is a matrix with elements
$i, j \in \{1, 2, 3\}$ with

\[
F_{ij} = \sum_{xy} P_{xy} \left( \frac{\partial}{\partial p_i} \log P_{xy} \right) \left( \frac{\partial}{\partial p_j} \log P_{xy} \right).
\tag{1.44}
\]

When isomorphically constrained in the space $\mathcal{P}_{\text{corr}}$, the Fisher Information is
$F_{ij}|_{b=c=0}$ with the only nonzero term being

\[
F_{11} = (1 - a) \left[ \frac{\partial}{\partial a} \log(1 - a) \right]^2 + a \left[ \frac{\partial}{\partial a} \log a \right]^2
= \frac{1}{a(1 - a)}.
\tag{1.45}
\]

This means that the smaller the variance the more the information obtained about $a$. In
the unconstrained space $\mathcal{P}'_{\text{corr}}$, the Fisher Information is a very different $3 \times 3$ matrix.
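
To make the difference concrete, the following sketch evaluates Eq. (1.44) for the full
three-parameter space and compares it with the single constrained element of Eq. (1.45).
It is an illustration only: the function names are hypothetical, and the full matrix is
evaluated at an interior point because it requires all four probabilities to be positive.

```python
def fisher_matrix(a, b, c):
    """3x3 Fisher information of Eq. (1.44) with free parameters (a, b, c).

    The four outcome probabilities are (a, b, c, d) with d = 1 - a - b - c,
    all assumed strictly positive.
    """
    d = 1.0 - a - b - c
    probs = (a, b, c, d)
    # score[k][i] = d(log P_k)/d(p_i): 1/p_k on the diagonal for k < 3, -1/d for k = 3.
    score = [[(1.0 / probs[k] if k == i else 0.0) for i in range(3)] for k in range(3)]
    score.append([-1.0 / d] * 3)
    return [[sum(probs[k] * score[k][i] * score[k][j] for k in range(4))
             for j in range(3)] for i in range(3)]

def fisher_constrained(a):
    """Single element F_11 of Eq. (1.45), with b = c = 0 imposed first."""
    return 1.0 / (a * (1.0 - a))

print(fisher_matrix(0.25, 0.25, 0.25))  # dense 3x3 matrix
print(fisher_constrained(0.5))          # single scalar
```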

Probability parameter gradients also allow estimation of probability parameters by
locating points where the Log Likelihood function is maximized, $\nabla \log L = 0$. This
evaluation takes very different forms in the isomorphically constrained space $\mathcal{P}_{\text{corr}}$ and the
unconstrained space $\mathcal{P}'_{\text{corr}}$. The likelihood function estimates probability parameters from
the observation of $n$ trials with $n_a$ appearances of event $(x, y) = (0, 0)$, $n_b$ appearances of
event $(x, y) = (0, 1)$, $n_c$ appearances of event $(x, y) = (1, 0)$, and $n_d$ appearances of event
$(x, y) = (1, 1)$. We have $n_a + n_b + n_c + n_d = n$, giving the Likelihood function

\[
L = f(n_a, n_b, n_c, n) \, a^{n_a} b^{n_b} c^{n_c} (1 - a - b - c)^{n - n_a - n_b - n_c}
\tag{1.46}
\]

where $f(n_a, n_b, n_c, n)$ gives the number of combinations. The optimization proceeds by
evaluating the gradient of the Log Likelihood function. When isomorphically constrained
in the space $\mathcal{P}_{\text{corr}}$, the gradient of the Log Likelihood function is

\[
\left. \nabla \log L \right|_{b=c=0} = \hat{a} \left( \frac{n_a}{a} - \frac{n - n_a}{1 - a} \right)
\tag{1.47}
\]

which equated to zero gives the optimal estimate at $a = n_a/n$ and $n_b = n_c = 0$ as
expected. Conversely, when unconstrained in the space $\mathcal{P}'_{\text{corr}}$, the gradient of the Log
Likelihood function evaluates as

\[
\begin{aligned}
\nabla \log L = {} & \hat{a} \left( \frac{n_a}{a} - \frac{n - n_a - n_b - n_c}{1 - a - b - c} \right)
 + \hat{b} \left( \frac{n_b}{b} - \frac{n - n_a - n_b - n_c}{1 - a - b - c} \right) \\
& + \hat{c} \left( \frac{n_c}{c} - \frac{n - n_a - n_b - n_c}{1 - a - b - c} \right).
\end{aligned}
\tag{1.48}
\]

This is obviously a very different result. However, in our case the same estimated
outcomes can be achieved in both spaces. For example, if an observation of $n$ trials shows $n_a$
instances of $(x, y) = (0, 0)$ and $n - n_a$ instances of $(x, y) = (1, 1)$ then both constrained
and unconstrained approaches give the best estimates of the probability parameters of
$(a, b, c, d) = \left( \frac{n_a}{n}, 0, 0, 1 - \frac{n_a}{n} \right)$.
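
A small sketch of the two estimators follows. It assumes the standard multinomial result
that the gradient of Eq. (1.48) vanishes when each probability equals its observed
frequency; the function names and the trial counts are illustrative only.

```python
def mle_constrained(n_a, n):
    """Zero of Eq. (1.47) in the constrained space, where b = c = 0."""
    return (n_a / n, 0.0, 0.0, (n - n_a) / n)

def mle_unconstrained(n_a, n_b, n_c, n):
    """Zero of Eq. (1.48): every probability equals its observed frequency."""
    return (n_a / n, n_b / n, n_c / n, (n - n_a - n_b - n_c) / n)

def grad_log_likelihood(a, b, c, n_a, n_b, n_c, n):
    """Gradient of Eq. (1.48); needs a, b, c and d = 1 - a - b - c all positive."""
    d = 1.0 - a - b - c
    n_d = n - n_a - n_b - n_c
    return (n_a / a - n_d / d, n_b / b - n_d / d, n_c / c - n_d / d)

# Perfectly correlated data: both approaches return the same estimate.
print(mle_constrained(30, 100))
print(mle_unconstrained(30, 0, 0, 100))

# Generic data: the unconstrained gradient vanishes at the frequency estimate.
a, b, c, _ = mle_unconstrained(30, 20, 10, 100)
print(grad_log_likelihood(a, b, c, 30, 20, 10, 100))
```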

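Finally, when $x$ and $y$ are perfectly correlated it is necessarily the case that
expectations satisfy $\langle x \rangle - \langle y \rangle = 0$, that variances satisfy $V(x) - V(y) = 0$, that the
joint entropy is equal to the entropy of each variable so $E_{xy} - E_x = 0$, and that finally, the
correlation between these variables satisfies $\rho_{xy} - 1 = 0$. In the unconstrained probability
space $\mathcal{P}'_{\text{corr}}$, the expectation, variance, and entropy relations of interest evaluate as

\[
\begin{aligned}
\langle x \rangle - \langle y \rangle &= c - b \\
V(x) - V(y) &= (c - b)(a - d) \\
E_x &= -\left[ (a + b) \log(a + b) + (1 - a - b) \log(1 - a - b) \right] \\
E_{xy} &= -\left[ a \log a + b \log b + c \log c + (1 - a - b - c) \log(1 - a - b - c) \right].
\end{aligned}
\tag{1.49}
\]

These functions lead to gradient relations in the $\mathcal{P}_{\text{corr}}$ and $\mathcal{P}'_{\text{corr}}$ spaces of:

\[
\begin{aligned}
\left. \nabla \left[ \langle x \rangle - \langle y \rangle \right] \right|_{b=c=0} &= 0 \\
\lim_{(b,c) \to (0,0)} \nabla \left[ \langle x \rangle - \langle y \rangle \right] &= -\hat{b} + \hat{c} \neq 0 \\
\left. \nabla \left[ V(x) - V(y) \right] \right|_{b=c=0} &= 0 \\
\lim_{(b,c) \to (0,0)} \nabla \left[ V(x) - V(y) \right] &= (1 - 2a) \hat{b} - (1 - 2a) \hat{c} \neq 0 \\
\left. \nabla \left[ E_{xy} - E_x \right] \right|_{b=c=0} &= 0 \\
\lim_{(b,c) \to (0,0)} \nabla \left[ E_{xy} - E_x \right] &= \text{undefined} \\
\left. \nabla \rho_{xy} \right|_{b=c=0} &= 0 \\
\lim_{(b,c) \to (0,0)} \nabla \rho_{xy} &\neq 0.
\end{aligned}
\tag{1.50}
\]

Obviously, taking the limit $(b, c) \to (0, 0)$ does not reduce the limit equations to the
required relations.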
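
The first two limit gradients can be reproduced with finite differences. The sketch below
treats $(a, b, c)$ as free parameters and evaluates the gradient at $b = c = 0$; since both
functions are polynomials, this equals the limit. Function names and the value $a = 0.3$
are illustrative assumptions.

```python
def mean_difference(a, b, c):
    """<x> - <y> = c - b for the square die, as in Eq. (1.49)."""
    return c - b

def variance_difference(a, b, c):
    """V(x) - V(y) = (c - b)(a - d) with d = 1 - a - b - c, as in Eq. (1.49)."""
    d = 1.0 - a - b - c
    return (c - b) * (a - d)

def gradient(func, a, b, c, h=1e-6):
    """Central-difference gradient with (a, b, c) all treated as free."""
    return (
        (func(a + h, b, c) - func(a - h, b, c)) / (2 * h),
        (func(a, b + h, c) - func(a, b - h, c)) / (2 * h),
        (func(a, b, c + h) - func(a, b, c - h)) / (2 * h),
    )

a = 0.3
print(gradient(mean_difference, a, 0.0, 0.0))      # approx (0, -1, 1)
print(gradient(variance_difference, a, 0.0, 0.0))  # approx (0, 1 - 2a, -(1 - 2a))
```

Inside the constrained space both quantities are identically zero, so their gradients
vanish there; the nonzero components above arise only because $b$ and $c$ are allowed to vary.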



Independent probability spaces 

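We next consider the case where the $x$ and $y$ variables are independent using the spaces
$\mathcal{P}_{\text{ind}}$ with isomorphism constraints or $\mathcal{P}'_{\text{ind}}$ with limit processes.

When random variables are independent, then their joint probability distribution is
separable for every allowable probability parameter of $\mathcal{P}_{\text{ind}}$ or $\mathcal{P}'_{\text{ind}}$. This means the
gradient of this separability property must be invariant across these probability spaces. That
is, we must have $P_{xy} = P_x P_y$ and hence $\nabla [P_{xy} - P_x P_y] = 0$. Similarly, separability
requires we also satisfy $\nabla [\langle xy \rangle - \langle x \rangle \langle y \rangle] = 0$. Further, every independent space must have
conditional probabilities equal to marginal probabilities and so satisfy $\nabla [P_{x|y} - P_x] = 0$.
Finally, two independent variables have joint entropy equal to the sum of the individual
entropies so every independent space must satisfy $\nabla [E_{xy} - E_x - E_y] = 0$. These
relations evaluate differently in either $\mathcal{P}_{\text{ind}}$ with isomorphism constraints or $\mathcal{P}'_{\text{ind}}$ with limit
processes. For the square die under consideration, we have probabilities and expectations
of

\[
\begin{aligned}
P_{xy}(0,0) - P_x(0) P_y(0) &= ad - bc \\
\langle xy \rangle - \langle x \rangle \langle y \rangle &= ad - bc \\
P_{x|y}(0|0) - P_x(0) &= \frac{ad - bc}{a + c}
\end{aligned}
\tag{1.51}
\]

and entropies of

\[
\begin{aligned}
E_x &= -(a + b) \log(a + b) - (1 - a - b) \log(1 - a - b) \\
E_y &= -(a + c) \log(a + c) - (1 - a - c) \log(1 - a - c) \\
E_{xy} &= -a \log a - b \log b - c \log c - d \log d.
\end{aligned}
\tag{1.52}
\]

The resulting gradients are

\[
\begin{aligned}
\left. \nabla \left[ P_{xy}(0,0) - P_x(0) P_y(0) \right] \right|_{ad=bc} &= 0 \\
\lim_{ad \to bc} \nabla \left[ P_{xy}(0,0) - P_x(0) P_y(0) \right] &= \lim_{ad \to bc} \nabla (ad - bc) \neq 0 \\
\left. \nabla \left[ \langle xy \rangle - \langle x \rangle \langle y \rangle \right] \right|_{ad=bc} &= 0 \\
\lim_{ad \to bc} \nabla \left[ \langle xy \rangle - \langle x \rangle \langle y \rangle \right] &= \lim_{ad \to bc} \nabla (ad - bc) \neq 0 \\
\left. \nabla \left[ P_{x|y}(0|0) - P_x(0) \right] \right|_{ad=bc} &= 0 \\
\lim_{ad \to bc} \nabla \left[ P_{x|y}(0|0) - P_x(0) \right] &= \lim_{ad \to bc} \nabla \left( \frac{ad - bc}{a + c} \right) \neq 0 \\
\left. \nabla \left[ E_{xy} - E_x - E_y \right] \right|_{ad=bc} &= 0 \\
\lim_{ad \to bc} \nabla \left[ E_{xy} - E_x - E_y \right] &= \lim_{ad \to bc} \nabla \left\{
 a \log \left[ \frac{a - ad + bc}{a} \right] + b \log \left[ \frac{b + ad - bc}{b} \right] \right. \\
&\qquad \left. {} + c \log \left[ \frac{c + ad - bc}{c} \right] + d \log \left[ \frac{d - ad + bc}{d} \right] \right\} \neq 0.
\end{aligned}
\tag{1.53}
\]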
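
The first of these relations is easy to verify directly. In the sketch below, the helper
`independent_point` (a hypothetical name) builds an independent die from two marginal
probabilities, and the analytic gradient of $ad - bc$ is taken with $(a, b, c)$ free and
$d = 1 - a - b - c$. The chosen marginals are arbitrary.

```python
def independent_point(px0, py0):
    """(a, b, c) for an independent die with P(x=0) = px0 and P(y=0) = py0."""
    return (px0 * py0, px0 * (1.0 - py0), (1.0 - px0) * py0)

def separability_defect(a, b, c):
    """P_xy(0,0) - P_x(0) P_y(0) = ad - bc, as in Eq. (1.51)."""
    d = 1.0 - a - b - c
    return a * d - b * c

def defect_gradient(a, b, c):
    """Analytic gradient of ad - bc with respect to free (a, b, c)."""
    d = 1.0 - a - b - c
    return (d - a, -(a + c), -(a + b))

a, b, c = independent_point(0.4, 0.3)
print(separability_defect(a, b, c))  # vanishes: the point is independent
print(defect_gradient(a, b, c))      # nonzero: neighbours need not be independent
```

The defect vanishes on the surface $ad = bc$, but its gradient in the full space does not,
because the unconstrained gradient probes neighbouring points that are not independent.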
1.2.4 Entropy maximization

The joint entropy $E_{xy}$ reflects the uncertainty between the $x$ and $y$ variables. According
to probability theory, this uncertainty does not include any uncertainty about which
probability space is being chosen, while conversely, according to game theory the
uncertainty between these variables increases when it includes additional uncertainty about
which probability space is being chosen.

We now present a numerical investigation of how to determine the maximum joint
entropy $E_{xy}$ of embedded probability states featuring possibly correlated variables $x$ and
$y$ as depicted in Fig. 1.3. The joint entropy is

\[
E_{xy}(a, b, c) = -\sum_{xy} P_{xy} \log P_{xy}.
\tag{1.54}
\]

Using isomorphism constraints, the maximization problem is

\[
\max \left. E_{xy} \right|_{\rho_{xy} = \rho}
\tag{1.55}
\]

for all $\rho \in [-1, 1]$. Here, the correlation function between $x$ and $y$ is given by the later Eq.
2.11. This equation can be inverted to solve for the variable $r$ as a function of $p$, $q$, and
the constant correlation $\rho$, and the result $r_+(p, q, \rho)$ is given in Eq. 3.10. A numerical
optimization then generates the maximum entropy value for every correlation state $\rho$
with the results shown in Fig. 1.4. As expected, the presence of isomorphism constraints
ensures the entropy ranges from a minimum of $\log 2$ up to a maximum of $2 \log 2$.

In contrast, when the joint entropy is maximized over the entire space using the
techniques of game theory, then a single maximum outcome is achieved giving the maximum
entropy in the absence of isomorphism constraints. This line is also shown in Fig. 1.4 as
the constant at $E_{xy,\max} = 2 \log 2$.
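
The endpoints quoted above can be illustrated without the $(p, q, r)$ parametrization of
the later chapters. The sketch below evaluates the joint entropy along the symmetric family
$a = d = (1 + \rho)/4$, $b = c = (1 - \rho)/4$, which has uniform marginals and correlation
exactly $\rho$. That the constrained maximum lies on this family is an assumption made
here for illustration; it is not established in the text, and the function names are hypothetical.

```python
import math

def symmetric_family(rho):
    """Probabilities (a, b, c, d) with uniform marginals and correlation rho."""
    return ((1 + rho) / 4, (1 - rho) / 4, (1 - rho) / 4, (1 + rho) / 4)

def entropy(probs):
    """Joint entropy in nats, using the convention 0 log 0 = 0."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)

def correlation(a, b, c, d):
    """Correlation of Eq. (1.38)."""
    return (a * d - b * c) / math.sqrt((c + d) * (a + b) * (b + d) * (a + c))

for rho in (-1.0, -0.5, 0.0, 0.5, 1.0):
    probs = symmetric_family(rho)
    print(rho, correlation(*probs), entropy(probs))
```

Along this family the entropy equals $2 \log 2$ at $\rho = 0$ and falls to $\log 2$ at
$\rho = \pm 1$, matching the range stated above.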

1.2.5 Continuous bivariate Normal spaces 

The above results are general. When source probability spaces are embedded within
target probability spaces, then the use of isomorphic mapping constraints will preserve
all properties of the embedded spaces. Conversely, when constraints are not used then
some of the properties of the embedded spaces will not be preserved in general. We
illustrate this now using normally distributed continuous random variables.

Consider two normally distributed continuous independent random variables $x$ and $y$
with $x, y \in (-\infty, \infty)$. When independent, these variables have a joint probability
distribution $P_{xy}$ which is continuous and differentiable in six variables, $P_{xy}(x, \mu_x, \sigma_x, y, \mu_y, \sigma_y)$,
where the respective means are $\mu_x$ and $\mu_y$ and the variances are $\sigma_x^2$ and $\sigma_y^2$. The marginal
distributions are $P_x(x, \mu_x, \sigma_x)$ and $P_y(y, \mu_y, \sigma_y)$. In particular, we have

\[
\begin{aligned}
P_{xy} &= \frac{1}{2 \pi \sigma_x \sigma_y}
 \exp \left\{ -\frac{1}{2} \left[ \frac{(x - \mu_x)^2}{\sigma_x^2} + \frac{(y - \mu_y)^2}{\sigma_y^2} \right] \right\} \\
P_x &= \frac{1}{\sqrt{2 \pi} \, \sigma_x} \exp \left\{ -\frac{1}{2} \frac{(x - \mu_x)^2}{\sigma_x^2} \right\} \\
P_y &= \frac{1}{\sqrt{2 \pi} \, \sigma_y} \exp \left\{ -\frac{1}{2} \frac{(y - \mu_y)^2}{\sigma_y^2} \right\}.
\end{aligned}
\tag{1.56}
\]

[Figure 1.4 plot: joint entropy $E(x, y)$ on the vertical axis against the correlation on the horizontal axis, from $-1$ to $1$.]

Figure 1.4: Maximizing the joint entropy of two correlated random variables $x, y \in \{0, 1\}$.
Without isomorphism constraints, the maximum entropy is equal to $2 \log 2$ (dashed line).
However, when subject to isomorphism constraints, the simplex will exactly reproduce the
different maximum entropy states of each of its embedded probability spaces (solid line).



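The conditional distribution for $x$ given some value of $y$ is

\[
P_{x|y} = \frac{1}{\sqrt{2 \pi} \, \sigma_x} \exp \left\{ -\frac{1}{2} \frac{(x - \mu_x)^2}{\sigma_x^2} \right\}.
\tag{1.57}
\]

These independent joint distributions can now be embedded into an enlarged
distribution representing two potentially correlated normally distributed variables $x$ and $y$.
This enlarged distribution $P'_{xy}(x, \mu_x, \sigma_x, y, \mu_y, \sigma_y, \rho)$ differs from $P_{xy}$ in its dependence on
the correlation parameter $\rho_{xy} = \rho$ with $\rho \in (-1, 1)$. This distribution is continuous and
differentiable in seven variables. The joint distribution is

\[
P'_{xy} = \frac{1}{2 \pi \sigma_x \sigma_y \sqrt{1 - \rho^2}}
 \exp \left\{ -\frac{1}{2 (1 - \rho^2)} \left[ \frac{(x - \mu_x)^2}{\sigma_x^2} + \frac{(y - \mu_y)^2}{\sigma_y^2}
 - \frac{2 \rho (x - \mu_x)(y - \mu_y)}{\sigma_x \sigma_y} \right] \right\}.
\tag{1.58}
\]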

The marginal distributions for the correlated case are identical to those of the independent
space so $P'_x = P_x$ and $P'_y = P_y$. The conditional distribution for $x$ given some value of $y$
is

\[
P'_{x|y} = \frac{1}{\sqrt{2 \pi (1 - \rho^2)} \, \sigma_x}
 \exp \left\{ -\frac{1}{2 (1 - \rho^2)} \frac{(x - \bar{\mu}_x)^2}{\sigma_x^2} \right\}
\tag{1.59}
\]

where the new conditioned mean is

\[
\bar{\mu}_x = \mu_x + \rho \frac{\sigma_x}{\sigma_y} (y - \mu_y).
\tag{1.60}
\]



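An isomorphic embedding requires that the unit probability subset of $P_{xy}$ be mapped
onto the unit probability subset of $P'_{xy}$ and this is achieved by imposing an external
constraint that $\rho = 0$ in the enlarged space. Hence, we expect $P'_{xy}|_{\rho=0} = P_{xy}$. It is readily
confirmed that when the isomorphism constraint is imposed on the enlarged distribution
all properties are preserved, while this is not the case in the absence of the constraint.
The gradient operator $\nabla$ is now a function of seven variables

\[
\nabla = \hat{x} \frac{\partial}{\partial x} + \hat{y} \frac{\partial}{\partial y}
 + \hat{\mu}_x \frac{\partial}{\partial \mu_x} + \hat{\mu}_y \frac{\partial}{\partial \mu_y}
 + \hat{\sigma}_x \frac{\partial}{\partial \sigma_x} + \hat{\sigma}_y \frac{\partial}{\partial \sigma_y}
 + \hat{\rho} \frac{\partial}{\partial \rho}.
\tag{1.61}
\]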



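The probability distributions must satisfy a number of gradient relations, but we have:

\[
\begin{aligned}
\left. \nabla \left[ P'_{xy} - P'_x P'_y \right] \right|_{\rho=0} &= 0 \\
\lim_{\rho \to 0} \nabla \left[ P'_{xy} - P'_x P'_y \right] &= \hat{\rho} \lim_{\rho \to 0} \frac{\partial}{\partial \rho} P'_{xy} \neq 0 \\
\left. \nabla \left[ P'_{x|y} - P'_x \right] \right|_{\rho=0} &= 0 \\
\lim_{\rho \to 0} \nabla \left[ P'_{x|y} - P'_x \right] &= \hat{\rho} \lim_{\rho \to 0} \frac{\partial}{\partial \rho} P'_{x|y} \neq 0.
\end{aligned}
\tag{1.62}
\]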



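Similarly, the expectations of functions of the $x$ and $y$ variables must also satisfy a number
of gradient relations. As expectations integrate over the $x$ and $y$ variables, the gradient
operator is now a function of only five variables,

\[
\nabla = \hat{\mu}_x \frac{\partial}{\partial \mu_x} + \hat{\mu}_y \frac{\partial}{\partial \mu_y}
 + \hat{\sigma}_x \frac{\partial}{\partial \sigma_x} + \hat{\sigma}_y \frac{\partial}{\partial \sigma_y}
 + \hat{\rho} \frac{\partial}{\partial \rho}.
\tag{1.63}
\]

We have

\[
\begin{aligned}
\left. \nabla \left[ \langle xy \rangle' - \langle x \rangle' \langle y \rangle' \right] \right|_{\rho=0} &= 0 \\
\lim_{\rho \to 0} \nabla \left[ \langle xy \rangle' - \langle x \rangle' \langle y \rangle' \right]
 &= \hat{\rho} \lim_{\rho \to 0} \frac{\partial}{\partial \rho} \langle xy \rangle' \neq 0.
\end{aligned}
\]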
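
The nonvanishing $\hat{\rho}$ component of the density gradient can be checked numerically.
The sketch below differentiates the correlated density of Eq. (1.58) with respect to $\rho$ at
$\rho = 0$ by central differences and compares it with the closed form
$P_{xy} (x - \mu_x)(y - \mu_y)/(\sigma_x \sigma_y)$, which follows from differentiating
Eq. (1.58) by hand. The function names and the sample point are illustrative assumptions.

```python
import math

def bivariate_normal(x, y, mu_x, mu_y, sigma_x, sigma_y, rho):
    """Correlated joint density P'_xy of Eq. (1.58)."""
    zx = (x - mu_x) / sigma_x
    zy = (y - mu_y) / sigma_y
    one_minus = 1.0 - rho * rho
    norm = 2.0 * math.pi * sigma_x * sigma_y * math.sqrt(one_minus)
    return math.exp(-(zx * zx + zy * zy - 2.0 * rho * zx * zy) / (2.0 * one_minus)) / norm

def rho_derivative_at_zero(x, y, mu_x, mu_y, sigma_x, sigma_y, h=1e-6):
    """Central-difference estimate of dP'_xy/drho at rho = 0."""
    plus = bivariate_normal(x, y, mu_x, mu_y, sigma_x, sigma_y, h)
    minus = bivariate_normal(x, y, mu_x, mu_y, sigma_x, sigma_y, -h)
    return (plus - minus) / (2 * h)

args = (1.0, 2.0, 0.0, 0.0, 1.0, 1.0)  # x, y, mu_x, mu_y, sigma_x, sigma_y
numeric = rho_derivative_at_zero(*args)
analytic = bivariate_normal(*args, 0.0) * (1.0 - 0.0) * (2.0 - 0.0)  # P_xy * zx * zy
print(numeric, analytic)
```

The derivative is nonzero wherever $(x - \mu_x)(y - \mu_y) \neq 0$, so the limit
$\rho \to 0$ leaves a residual $\hat{\rho}$ component that the constraint $\rho = 0$ removes.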



1.2.6 Quantum probability spaces 

As noted above, the use of isomorphic mappings to preserve the properties of probability 
spaces is general. As a last illustration, we show the use of isomorphic mappings when 
applied to quantum probability spaces. 

Suppose a quantum probability space is to be embedded within another enlarged
quantum probability space. (See [14] for an overview of quantum information theory
including quantum information geometry.) An $N$ level quantum system has von Neumann
entropy defined as

\[
E_N = -\operatorname{tr} R_N \log R_N
\tag{1.64}
\]

where here $R_N$ is the quantum density matrix and tr indicates a trace operation applied
to a matrix. Supposing that matrix $D$ diagonalizes the density matrix so $D R_N D^{-1}$ is
diagonal, and that its eigenvalues are $\lambda_i$ for $1 \leq i \leq N$, we have

\[
E_N = -\sum_{i=1}^{N} \lambda_i \log \lambda_i.
\tag{1.65}
\]

The eigenvalue $\lambda_i$ specifies the occupancy probability of the $i$th level. Hence, maximizing
the $N$-level system entropy requires that $\lambda_i = 1/N$ for all $i$. Consequently, a two level
quantum system maximizes its entropy $E_2$ when the density matrix is an equiprobable
mixture equal to half of the two level identity matrix, $R_2 = I_2/2$, while a three level
quantum system maximizes its entropy $E_3$ when the density matrix is an equiprobable
mixture of $R_3 = I_3/3$.

Now, if the two level system were isomorphically embedded within a three level system,
then the two level system entropy $E_2$ is properly maximized only when isomorphism
constraints are used to decouple the third level so that it plays no part in the optimization.
This is achieved by using an isomorphism constraint $\lambda_3 = 0$ to decouple and remove the
third level from the system. That is, the optimization taking account of an isomorphism
constraint $\nabla_3 E_3|_{\lambda_3=0} = 0$ will determine the correct maximum value for $E_2$. However,
a failure to use an isomorphism constraint will locate an incorrect maximum point via
$\lim_{\lambda_3 \to 0} \nabla_3 E_3$. We have

\[
\nabla_2 E_2 = \left. \nabla_3 E_3 \right|_{\lambda_3=0} \neq \lim_{\lambda_3 \to 0} \nabla_3 E_3.
\tag{1.66}
\]

Isomorphism constraints must be used to properly embed one quantum probability space
within another.
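
A minimal sketch of this comparison works directly with the eigenvalues of Eq. (1.65), so
no matrix diagonalization is needed. The function names are hypothetical, natural
logarithms are assumed, and `constrained_maximizer` simply writes down the uniform
occupancy over the levels left free by the constraint.

```python
import math

def von_neumann_entropy(eigenvalues):
    """E_N = -sum_i lambda_i log lambda_i of Eq. (1.65), with 0 log 0 = 0."""
    return -sum(lam * math.log(lam) for lam in eigenvalues if lam > 0.0)

def constrained_maximizer(n_levels, n_active):
    """Uniform occupancy over the first n_active levels; the rest are pinned to zero."""
    return [1.0 / n_active] * n_active + [0.0] * (n_levels - n_active)

embedded_two_level = constrained_maximizer(3, 2)  # lambda_3 = 0 imposed
free_three_level = constrained_maximizer(3, 3)    # no constraint on lambda_3

print(von_neumann_entropy(embedded_two_level))  # log 2, the correct maximum of E_2
print(von_neumann_entropy(free_three_level))    # log 3, the maximum of E_3
```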

1.2.7 Perfect correlation reduces dimensionality 

Standard probability theory holds that when two variables $x$ and $y$ are known to be
perfectly correlated, then $P(x, y) = P(x) P(y|x) = P(x)$. That is, any optimization which
involves the joint distribution $P(x, y)$ does not involve two dimensions but only one as
$x = y$. Perfect correlation reduces dimensionality, which alters the gradient operators,
which in turn can alter optima.

Probability theory takes account of this dimensionality reduction when using Affine
variable transforms. Typical presentations of probability theory hold that "any two
real-valued random variables x and y whose mean values and variances exist may be
represented as an Affine transformation of a pair of uncorrelated random variables" [15].
Such statements, carelessly interpreted, would indeed suggest that perfect correlations
involve no reduction in the number of variables. Writing the respective mean values as
$\langle x \rangle$ and $\langle y \rangle$ and the standard deviations as $\sigma_x$ and $\sigma_y$, and defining the translated
and rescaled variables

\[
\begin{aligned}
x^* &= \frac{x - \langle x \rangle}{\sigma_x} \\
y^* &= \frac{y - \langle y \rangle}{\sigma_y},
\end{aligned}
\tag{1.67}
\]

then an affine transformation can always be used to define two new variables

\[
\begin{aligned}
u &= x^* + y^* \\
v &= x^* - y^*.
\end{aligned}
\tag{1.68}
\]

These variables each have mean zero, $\langle u \rangle = \langle v \rangle = 0$, and are uncorrelated as

\[
\operatorname{cov}(u, v) = \langle uv \rangle = 0.
\tag{1.69}
\]

Figure 1.5: (a) An affine transformation of correlated variables $x$ and $y$ generates new
orthogonal variables $u = x + y$ and $v = x - y$ which are uncorrelated. (b) When $x$ and
$y$ are perfectly correlated, $v = 0$ and $u$ is the only free variable and dimensionality is
reduced. Optimization solutions must lie on the $u$-axis satisfying the constraint $x = y$.

The zero covariance results from the orthogonality of the random variables $u$ and $v$ in
a suitable $L^2$ vector space, while the possibly correlated original variables are generated
from the inverse affine transformation

\[
\begin{aligned}
x &= \sigma_x x^* + \langle x \rangle = \frac{\sigma_x}{2} (u + v) + \langle x \rangle \\
y &= \sigma_y y^* + \langle y \rangle = \frac{\sigma_y}{2} (u - v) + \langle y \rangle,
\end{aligned}
\tag{1.70}
\]

where here, $\sigma_z$ is the standard deviation of variable $z \in \{x, y\}$.

If the $x$ and $y$ variables are perfectly correlated, then $v$ is identically zero and $u$ is the
only surviving variable. Perfect correlations reduce the dimensionality of the optimization
space and probability theory preserves the dimensionality of perfectly correlated variables
when using Affine transforms. (See Fig. 1.5.)
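
These statements can be verified exactly on the square die of Fig. 1.3. The sketch below
builds the standardized variables implied by Eq. (1.70), forms $u$ and $v$, and returns
their covariance together with the variance of $v$. The function name `uv_moments` and the
sample probabilities are illustrative assumptions.

```python
import math

# Outcomes (x, y) ordered to match the probabilities (a, b, c, d).
OUTCOMES = ((0, 0), (0, 1), (1, 0), (1, 1))

def uv_moments(a, b, c):
    """Return (cov(u, v), var(v)) for u = x* + y* and v = x* - y* on the square die."""
    probs = (a, b, c, 1.0 - a - b - c)
    mean_x = sum(p * x for p, (x, _) in zip(probs, OUTCOMES))
    mean_y = sum(p * y for p, (_, y) in zip(probs, OUTCOMES))
    sd_x = math.sqrt(sum(p * (x - mean_x) ** 2 for p, (x, _) in zip(probs, OUTCOMES)))
    sd_y = math.sqrt(sum(p * (y - mean_y) ** 2 for p, (_, y) in zip(probs, OUTCOMES)))
    cov_uv = 0.0
    var_v = 0.0
    for p, (x, y) in zip(probs, OUTCOMES):
        x_star = (x - mean_x) / sd_x  # standardized variables, as in Eq. (1.70)
        y_star = (y - mean_y) / sd_y
        u, v = x_star + y_star, x_star - y_star
        cov_uv += p * u * v  # <u> = <v> = 0, so cov(u, v) = <uv>
        var_v += p * v * v
    return cov_uv, var_v

print(uv_moments(0.1, 0.2, 0.3))  # correlated die: cov(u, v) vanishes, var(v) > 0
print(uv_moments(0.3, 0.0, 0.0))  # perfectly correlated die: var(v) vanishes too
```

For the perfectly correlated die the variance of $v$ is zero, so $v$ carries no freedom
and only $u$ survives.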

A similar preservation of dimensionality occurs in the Hotelling transform, a discrete
version of the Karhunen-Loeve transform [16]. This transform can also be used to map
the probability space of two uncorrelated centered variables $(u, v)$ into the probability
space of two correlated centered variables $(x, y)$. If the state of correlation between $x$ and
$y$ is $\rho$, then the Hotelling transform is implemented via

\[
\begin{pmatrix} x \\ y \end{pmatrix}
= \begin{pmatrix} 1 & 0 \\ \rho & \sqrt{1 - \rho^2} \end{pmatrix}
\begin{pmatrix} u \\ v \end{pmatrix}.
\tag{1.71}
\]

Then, whenever the $x$ and $y$ variables are not perfectly correlated both the $(u, v)$ and $(x, y)$
probability spaces are two dimensional. However, when $\rho = 1$ and $x$ and $y$ are perfectly
correlated, then the mapping matrix becomes singular and non-invertible ensuring that
$x = y = u$ so that the $x$ and $y$ probability space is one dimensional even while the $u$
and $v$ probability space is two dimensional. Probability theory again acts to preserve the
dimensionality of the joint probability space of perfectly correlated variables.
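
The loss of invertibility at $\rho = 1$ can be seen from the determinant of the mapping
matrix. The sketch below is illustrative: the function names are hypothetical, and the
covariance calculation additionally assumes that $u$ and $v$ have unit variance, which the
text does not state.

```python
import math

def hotelling_matrix(rho):
    """Mapping matrix of Eq. (1.71): x = u, y = rho * u + sqrt(1 - rho^2) * v."""
    return ((1.0, 0.0), (rho, math.sqrt(1.0 - rho * rho)))

def determinant(m):
    """Determinant of a 2x2 matrix."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

def output_covariance(m):
    """Covariance M M^T of (x, y), assuming (u, v) uncorrelated with unit variance."""
    return tuple(
        tuple(sum(m[i][k] * m[j][k] for k in range(2)) for j in range(2))
        for i in range(2)
    )

print(determinant(hotelling_matrix(0.6)))        # nonzero: invertible, two dimensional
print(output_covariance(hotelling_matrix(0.6)))  # off-diagonal element equals rho
print(determinant(hotelling_matrix(1.0)))        # zero: singular, x = y = u
```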

1.2.8 Example isomorphic functions 

There are different ways to embed a smaller source function within an enlarged target
function which can preserve different amounts of the structure of the source function
within the target function. Consider for example, mapping a 1-dimensional function
$f(x)$ into a 2-dimensional function $g(x, y)$ along the line $y = x$ so that $f(x) = g(x, x)$.
One way to implement this assignment is to use limit processes constraining most of the
neighbourhood of $g(x, y)$ in the vicinity of the line $y = x$ to satisfy

\[
\lim_{y \to x} g(x, y) = f(x).
\tag{1.72}
\]

Another way to do this is to ignore the values of $g(x, y)$ away from the line $y = x$ and
simply use externally imposed constraints forcing the assignment on the line via

\[
\left. g(x, y) \right|_{y=x} = f(x).
\tag{1.73}
\]

This approach does not care about values $g(x, y)$ when $x \neq y$. The question then is, under
what circumstances can $\lim_{y \to x} g(x, y)$ or $g(x, y)|_{y=x}$ be used to examine the properties
of $f(x)$.

Hereinafter, for concreteness we will consider the simplified example functions $f(x) = x^2$
and $g(x, y) = xy$. Each of the implementations, $\lim_{y \to x} g(x, y)$ or $g(x, y)|_{y=x}$, have
different domains (dom) in each space, and hence different integration volume elements
(dv)

\[
\begin{array}{lccc}
 & f(x) & \lim_{y \to x} g(x, y) & g(x, y)|_{y=x} \\
\text{dom} & \mathbb{R} & \mathbb{R} \times \mathbb{R} & \mathbb{R} \\
\text{dv} & dx & dx \, dy & dx.
\end{array}
\tag{1.74}
\]

The different dimensionalities of the domains impact on any attempt to change
variables within each space. The rank of the change of variable transforms (A) and the
dimensionality of the Jacobian matrices (J) in each space are

\[
\begin{array}{lccc}
 & f(x) & \lim_{y \to x} g(x, y) & g(x, y)|_{y=x} \\
\text{rank}(A) & 1 & 2 & 1 \\
\dim(J) & 1 & 2 & 1.
\end{array}
\tag{1.75}
\]



These differences impact on the evaluation of other properties such as gradients, which
should evaluate as

\[
\nabla f(x) = 2x \hat{x}
\tag{1.76}
\]

where a hatted variable denotes a unit vector in the indicated direction. In contrast, the
gradient evaluated using a limit assignment gives

\[
\lim_{y \to x} \nabla g(x, y) = \lim_{y \to x} \left( y \hat{x} + x \hat{y} \right) = x (\hat{x} + \hat{y}),
\tag{1.77}
\]

which does not satisfy the required relation. Conversely, the use of an externally imposed
constraint ensures

\[
\left. \nabla g(x, y) \right|_{y=x} = \nabla g(x, x) = 2x \hat{x}
\tag{1.78}
\]

as required.
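
The two gradients can be compared numerically for $f(x) = x^2$ and $g(x, y) = xy$. The
sketch below uses central differences; the function names `limit_gradient` and
`constrained_derivative` are illustrative labels for the two procedures.

```python
def f(x):
    return x * x

def g(x, y):
    return x * y

def limit_gradient(x, h=1e-6):
    """Gradient of g with x and y independent, evaluated on the line y = x."""
    dg_dx = (g(x + h, x) - g(x - h, x)) / (2 * h)
    dg_dy = (g(x, x + h) - g(x, x - h)) / (2 * h)
    return (dg_dx, dg_dy)

def constrained_derivative(x, h=1e-6):
    """Derivative of g(x, y)|_{y=x}: the constraint is imposed before differentiating."""
    return (g(x + h, x + h) - g(x - h, x - h)) / (2 * h)

x0 = 3.0
print(limit_gradient(x0))          # approx (x0, x0), i.e. x0 * (x_hat + y_hat)
print(constrained_derivative(x0))  # approx 2 * x0, matching f'(x0)
```

Both procedures agree on the value $g(x_0, x_0) = f(x_0)$, but only the constrained
derivative reproduces $f'(x_0) = 2 x_0$.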

In summary, the definitions

\[
f(x) = \left. g(x, y) \right|_{y=x} = \lim_{y \to x} g(x, y),
\tag{1.79}
\]

do not generally carry over to the gradient relations, as

\[
\nabla f(x) = \left. \nabla g(x, y) \right|_{y=x} \neq \lim_{y \to x} \nabla g(x, y).
\tag{1.80}
\]

This results as the limit process $f(x) = \lim_{y \to x} g(x, y)$ treats the $x$ and $y$ variables as
being independent and simply evaluates desired quantities at points $(x, y)$ lying on the
line $y = x$. In contrast, the constraint $f(x) = g(x, y)|_{y=x}$ enforces a functional relation
between the $x$ and $y$ variables which preserves all the structures of $f(x)$ within $g(x, x)$.
It is well understood that any functional relation between the variables of a function will
impact on the properties of that function. Such functional relations must be preserved
whenever that function is mapped into a different space. The need to take account
of such functional relations is a standard part of routine optimization techniques such
as differentiation via any of the chain rule, Lagrangian multipliers, or directed vector
gradients.

A number of standard techniques exist for evaluating the gradient $f'(x)$ using the
constrained function $g(x, y)|_{y=x}$. For instance, the chain rule can be applied to the
functions $g(x, y)$ and $y(x) = x$ giving

\[
f'(x) = \frac{\partial g}{\partial x} + \frac{\partial g}{\partial y} \frac{dy}{dx}
= \left. (y + x) \right|_{y=x} = 2x.
\tag{1.81}
\]

Another common alternative is by using Lagrange multipliers in which $f'(x) = L'(x)$
with

\[
L(x, y, \lambda) = xy - \lambda (y - x)
\tag{1.82}
\]

and

\[
\begin{aligned}
\frac{\partial L}{\partial x} &= y + \lambda \\
\frac{\partial L}{\partial y} &= x - \lambda \\
\frac{\partial L}{\partial \lambda} &= x - y.
\end{aligned}
\tag{1.83}
\]

Equating the last two lines to zero gives the required constraints $y = x$ and $\lambda = x$
ensuring $f'(x) = L'(x) = 2x$. A final way to perform this constrained optimization is to use
directed vector gradients where

\[
f'(x) = \lim_{y \to x} \nabla g(x, y) \cdot \hat{v} \sqrt{2}
\tag{1.84}
\]

with $\hat{v} = (\hat{x} + \hat{y})/\sqrt{2}$. Here, $\hat{v}$ is normalized and the extra factor of $\sqrt{2}$ properly calculates
changes in the $x$ direction. This gives the magnitude of the gradient as $f'(x) = 2x$ as
required.
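
The three techniques can be written out side by side for $g(x, y) = xy$ on the line
$y = x$. The sketch below hard-codes the partial derivatives of this particular example
rather than differentiating symbolically, and the function names are illustrative.

```python
import math

def derivative_chain_rule(x):
    """Chain rule of Eq. (1.81): dg/dx + (dg/dy)(dy/dx) for g = x * y on y(x) = x."""
    y = x
    dg_dx, dg_dy, dy_dx = y, x, 1.0
    return dg_dx + dg_dy * dy_dx

def derivative_lagrange(x):
    """Lagrange multiplier route for L = x * y - lam * (y - x)."""
    lam = x  # from dL/dy = x - lam = 0
    y = x    # from dL/dlam = x - y = 0
    return y + lam  # remaining component dL/dx

def derivative_directed(x):
    """Directed gradient of Eq. (1.84): projection on (x_hat + y_hat)/sqrt(2), times sqrt(2)."""
    grad = (x, x)  # grad g = (y, x) evaluated on y = x
    v_hat = (1.0 / math.sqrt(2.0), 1.0 / math.sqrt(2.0))
    return (grad[0] * v_hat[0] + grad[1] * v_hat[1]) * math.sqrt(2.0)

for x0 in (-2.0, 0.5, 3.0):
    print(x0, derivative_chain_rule(x0), derivative_lagrange(x0), derivative_directed(x0))
```

All three routes return $2x$, in agreement with $f'(x)$ for $f(x) = x^2$.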

There are two ways to embed the function $f(x)$ within the surface $g(x, y)$ using either
a limit process or an externally imposed constraint. The limit process fails to preserve
many of the properties of the source function within the target function. Conversely, the
external constraint does ensure that all source function structures are preserved within
the target function, including dimensionality, gradient, and so on. In general, it is not possible
to embed a smaller space within a larger space and preserve gradients and optimization
outcomes without the use of constraints. These constraints reflect the use of isomorphic
mappings to preserve the properties of the source space within the target space.

Figure 1.6: A schematic representation where a three dimensional target probability
strategy space $(p, q, r)$ embeds respectively several one dimensional probability spaces associated
with perfectly correlated variables (lines, upper left and lower right), and a two
dimensional probability space associated with independent variables (plane, middle). An exact
isomorphism preserves the respective original tangent spaces shown via one and two
dimensional axes offset in background. A weak isomorphism fails to preserve the original
tangent spaces of the source probability distributions and assigns the three dimensional
tangent space of the target space to every embedded distribution (as shown in foreground
slightly offset from each embedded space).

1.3 Isomorphisms and Optimization

There are two approaches to optimization over probability spaces presented here.
Probability theory uses isomorphic constraints to exactly preserve the properties of embedded
probability spaces and then compares these exactly calculated values. Game theory
eschews the use of isomorphic constraints and in effect, argues that any uncertainty about
which probability space to choose bleeds into many calculations within a given space and
alters the calculated outcomes.

When probability spaces are represented as geometries, then it is expected that at
least some of the properties of the probability space will be rendered in geometric terms.
How these geometrical properties are preserved when a probability space is embedded
within another is the question. Probability theory requires the exact preservation of all
properties of every source space and this is achieved by imposing different constraints on
different points within the target space. Game theory in contrast, imposes a single target
space geometry onto every source probability space. One way to picture this is shown in
Fig. 1.6. This figure shows how probability theory exactly preserves the dimensionality
and tangent spaces of embedded probability spaces, while game theory overwrites these
properties of the embedded spaces with the corresponding properties of the mixed space.

In probability theory, the different isomorphism constraints and tangent spaces acting
at each point define non-intersecting lines and surfaces within the target space. Some
of these are shown in Fig. 1.7 representing the $(p, q, r)$ simplex of the two potentially
correlated $x$ and $y$ variables (this behavioural space is defined in the next Chapter). Here,
each state of correlation is a constant and cannot vary during an optimization analysis so
an optimization procedure must sequentially take account of every possible correlation
state between these variables, setting $\rho_{xy} = \rho$ for all $\rho \in [-1, 1]$. These optimum points
can then be compared to determine which correlation state between $x$ and $y$ returns the
best value.

Unsurprisingly, these two distinct approaches can sometimes generate conflicting
results.

1.3.1 Isomorphism constraints alter geometry 

In general, the imposition of any specific isomorphism constraint can be expected to alter
the geometry of optimization space and alter optimization outcomes. We now illustrate
this briefly.

Figure 1.7: Every point within the $(p, q, r)$ probability space shown specifies a particular
state of correlation $\rho_{xy}(p, q, r)$ between the $x$ and $y$ variables. We show here several
lines and surfaces of constant correlation taking values from top left to bottom right of
$\rho_{xy} = +1, +0.75, +0.25, 0, -0.25, -0.75, -1$. The optimization of expectations at any
point $(p, q, r)$ must take account of correlated changes between $x$ and $y$.

Consider a three dimensional volume in which Pythagoras's rule specifies the distance
$ds$ between points $(x, y, z)$ and $(x + \Delta x, y + \Delta y, z + \Delta z)$ as

\[
ds^2 = dx^2 + dy^2 + dz^2.
\tag{1.85}
\]

That Pythagoras's rule is satisfied indicates that the space is flat. In contrast, when some
constraint is adopted via $z = f(x, y)$ then the shortest distance between two points no
longer satisfies Pythagoras's rule, indicating that the constraint has rendered the space
curved. Consider the example relation

\[
z^2 = r^2 - x^2 - y^2,
\tag{1.86}
\]

where $r$ denotes a radius of curvature. The surface constraint now requires

\[
z \, dz = -x \, dx - y \, dy,
\tag{1.87}
\]

so

\[
dz^2 = \frac{(x \, dx + y \, dy)^2}{r^2 - x^2 - y^2}.
\tag{1.88}
\]

In turn, this gives the shortest path distance between $(x, y)$ and $(x + \Delta x, y + \Delta y)$ as

\[
\begin{aligned}
ds^2 &= dx^2 + dy^2 + dz^2 \\
&= \left( 1 + \frac{x^2}{r^2 - x^2 - y^2} \right) dx^2
 + \left( 1 + \frac{y^2}{r^2 - x^2 - y^2} \right) dy^2
 + \frac{2xy}{r^2 - x^2 - y^2} \, dx \, dy.
\end{aligned}
\tag{1.89}
\]

Self-evidently, this shortest distance between the points $(x, y)$ and $(x + \Delta x, y + \Delta y)$ does
not satisfy Pythagoras's rule reflecting the fact that the space is now curved.
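
As a consistency check on Eqs. (1.87) to (1.89), the sketch below computes the squared line
element in two ways: by substituting the constrained $dz$ into $dx^2 + dy^2 + dz^2$, and
from the induced two dimensional metric. The function names and the sample point on the
unit sphere are illustrative assumptions.

```python
import math

def ds2_embedded(x, y, dx, dy, r):
    """dx^2 + dy^2 + dz^2 with dz fixed by the surface constraint of Eq. (1.87)."""
    z = math.sqrt(r * r - x * x - y * y)
    dz = -(x * dx + y * dy) / z
    return dx * dx + dy * dy + dz * dz

def ds2_induced(x, y, dx, dy, r):
    """Induced metric of Eq. (1.89) expressed in the coordinates (x, y) only."""
    w = r * r - x * x - y * y
    return ((1 + x * x / w) * dx * dx
            + (1 + y * y / w) * dy * dy
            + (2 * x * y / w) * dx * dy)

point = (0.3, 0.4, 0.01, -0.02, 1.0)  # x, y, dx, dy, r
flat = point[2] ** 2 + point[3] ** 2  # planar Pythagorean value dx^2 + dy^2
print(ds2_embedded(*point), ds2_induced(*point), flat)
```

The two constrained expressions agree with each other and exceed the planar value
$dx^2 + dy^2$ at this point, which is the sense in which the constraint has curved the space.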

The adoption of a curvature imposing constraint ensures that optimization problems 
(the shortest path distance) within the plane are altered and so locate different optima. 
Further, theorems valid in flat space are no longer applicable in the now curved space. 
When it is possible to impose curvature inducing constraints on a space to alter opti- 
mization outcomes, then it is necessary to examine every possibility to ensure a complete 
optimization. 



1.4 Discussion 

A rational player must compare expected payoffs across the mixed strategy space in order 
to locate equilibria. As expectations are polylinear, such comparisons are mathematically 
equivalent to calculating gradients and the issues raised in this paper apply. Further, it 
is perfectly possible that a rational player might need to calculate the Fisher information 
defined in terms of gradients of probability distributions in order to optimize payoffs. It is 
perfectly possible that a rational player might well need to optimize an Entropy gradient 
to maximize a payoff. It is even perfectly possible to define games where payoffs depend 
directly on the gradient of a probability distribution — shine light through a sheet of glass 
painted by players to alter transmission probabilities and make payoffs dependent on the 
resulting light intensity gradients (call it the interior decorating game). We have shown 
that rational players working with the standard strategy spaces of game theory will have 
difficulties with these games. 

We have highlighted two alternate ways to optimize a multivariate function $\Pi(x, y)$
where $x$ and $y$ might be functionally related in different ways, $y = g_i(x)$ for different $i$
say. The first approach, common to probability theory and general optimization theory,
considers each potential functional relation as occupying a distinct space and approaches
the optimization as a choice between distinct spaces. Any uncertainty about which space
to choose does not leak into the properties of any individual space. If desired, isomorphic
constraints can be used to embed all these distinct spaces into a single enlarged space
for convenience, but if so, all the properties of the optimization problem are exactly
preserved. The second approach, common to game theory, holds that the uncertainty
about which functional relation to choose should appear in the same space as the variables
$(x, y)$. This is accomplished by expanding the size of the space to include both the old
variables $x$ and $y$ and sufficient new variables (not explicitly shown here) to contain
all the potential functional relations and allow $\lim_{y \to g_i(x)} \Pi(x, y) = \Pi[x, g_i(x)]$ for all $i$.
This enlarged space then allows gradient comparisons to be made at points
$\Pi[x, g_i(x)] - \Pi[x, g_j(x)]$ for all $i$ and $j$ to locate optima. These two approaches can lead to conflicting
optimization outcomes as while these approaches generally assign the same values to
functions at all points,

\[
\left. \Pi(x, y) \right|_{y=g_i(x)} = \lim_{y \to g_i(x)} \Pi(x, y),
\tag{1.90}
\]

they typically calculate different gradients at those same points

\[
\left. \nabla \Pi(x, y) \right|_{y=g_i(x)} \neq \lim_{y \to g_i(x)} \nabla \Pi(x, y).
\tag{1.91}
\]

These differences can be extreme when the function $\Pi(x, y)$ depends on global properties
of the space, such as the dimension, volume, gradient, information or entropy. In its
approach, game theory differs from many other fields in how it models alternate functional
dependencies including other fields of economics. For example, the Euler-Lagrange
equations of Ramsey-type models consider the functional variation of some function $u$ while
ensuring a consistent treatment of the gradient of the function $u'$ [18]. Gradients are not
taken in any limit in these fields.

Throughout this work, we have presumed that a rational player should be able to 
use standard techniques from either probability theory or optimization theory on the one 
hand, or decision theory and game theory on the other, and expect all of these methods 
to provide consistent results. We have shown that when considering multiple, poten- 
tially correlated variables, and functions of these variables dependent on the geometry 
of the probability parameter space, then these methods can give rise to contradictory 
optimization outcomes. We have suggested decision and game theory are incomplete 
when they require the adoption of a single geometry for any decision or game tree, and 
that these fields should consider applying the alternate geometries of probability theory 
and optimization theory. Recognizing that a single multi-stage decision or game tree can 
encompass an infinite number of incommensurate probability spaces might resolve some 
of the paradoxes of game theory, and have broader application. 

The specification of a probability space determines which variables exist and whether 
they are functionally constrained or freely varying. Given the choice of a probability 
space, optimization can only take place with respect to the freely varying parameters 
within that adopted space. Should players wish to explore a broader range of variation, 
then they must seek to alter the functional assignments of some of their random variables 
and functions, and so will alter their probability spaces. In other words, rational players 
of unbounded capacity will search both among different probability spaces, which are not
always guaranteed to give the same outcomes, as well as search within each space over 
all of the freely varying parameters of each probability space. Rational players require 
a decision procedure mediating this dual search of all possible probability spaces and all 
possible variables within each space, and that is what we seek to provide here. 

Every probabilistic decision can be modeled by an infinite number of different
probability measure spaces. For many decisions, it is immediately obvious that every
alternative space leads to exactly the same optimized outcomes. The question is whether
this is true for every possible decision and every possible strategic interaction. Before
answering this question, we examine the probability spaces typically encountered
in game theory. In particular, we focus on mixed strategy probability measure spaces,
behavioural strategy probability measure spaces, and correlated equilibria probability
measure spaces.

1.5 Appendix: Correlation and mutual information 

We employ probability space isomorphisms based on correlation. However, it is not clear
that correlation is the appropriate measure to use. It is well known that this measure of
linear correlation is insensitive to nonlinear correlations. Because of this, other measures
might be more useful. When two variables are correlated, and if this correlation is ignored,
then information has been discarded. It might well be the case that information based
measures, in particular mutual information, provide a better way to take account
of the interrelatedness of random variables [15].



1.5.1 Nonlinear dependencies and correlation 

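The correlation between arbitrary random variables $x$ and $y$ is

$$\rho_{x,y} = \frac{\mathrm{cov}(x,y)}{\sigma_x \sigma_y}
  = \frac{\langle xy\rangle - \langle x\rangle\langle y\rangle}
         {\sqrt{\langle x^2\rangle - \langle x\rangle^2}\,\sqrt{\langle y^2\rangle - \langle y\rangle^2}}, \tag{1.92}$$

defined in terms of the covariance $\mathrm{cov}(x,y)$, the variance $\sigma_x^2 = \mathrm{cov}(x,x)$,
and the mean $\langle x\rangle$.

Consider two discrete random variables $x$ and $y$, with $x$ being any of $x \in \{-1, 0, 1\}$
with equal probability $\frac{1}{3}$, and $y = x^2 \in \{0, 1\}$ so $P(y = 0) = \frac{1}{3}$ and
$P(y = 1) = \frac{2}{3}$. These variables would normally be considered to be highly correlated
as knowing $x$ immediately specifies $y$, while knowing $y$ narrows the possible values of $x$
to $x = \pm\sqrt{y}$. The respective probability distributions are

$$\begin{aligned}
P(x) &= \sum_{y=0}^{1} P(x,y) = \frac{1}{3} \\
P(y) &= \sum_{x=-1}^{1} P(x,y) = \frac{1}{3}\left(\delta_{y,0} + 2\delta_{y,1}\right) \\
P(x|y) &= \delta_{x,0}\delta_{y,0} + \frac{1}{2}\delta_{y,1}\left(\delta_{x,-1} + \delta_{x,1}\right) \\
P(y|x) &= \delta_{x,-1}\delta_{y,1} + \delta_{x,0}\delta_{y,0} + \delta_{x,1}\delta_{y,1}.
\end{aligned} \tag{1.93}$$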

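These distributions then give

$$\begin{aligned}
\mathrm{cov}(x,y) &= \langle xy\rangle - \langle x\rangle\langle y\rangle \\
 &= \sum_{x=-1}^{1}\sum_{y=0}^{1} P(x,y)\, x y \\
 &= 0,
\end{aligned} \tag{1.94}$$

where the second line uses $\langle x\rangle = 0$. This zero covariance then specifies a zero
coefficient of linear correlation $\rho_{xy} = 0$, but as noted above, this does not mean
these variables are independent. Better measures of interrelatedness indicate this.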
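
As a quick check of Eq. (1.94), the following minimal Python sketch evaluates the covariance of the example exactly using rational arithmetic. It is an illustration only and not part of the original analysis; the names `joint` and `expect` are hypothetical helpers introduced here.

```python
from fractions import Fraction

# Joint distribution P(x, y) for x uniform on {-1, 0, 1} and y = x^2.
joint = {(x, x * x): Fraction(1, 3) for x in (-1, 0, 1)}

def expect(f):
    """Expectation of f(x, y) under the joint distribution."""
    return sum(prob * f(x, y) for (x, y), prob in joint.items())

mean_x = expect(lambda x, y: x)
mean_y = expect(lambda x, y: y)
cov_xy = expect(lambda x, y: x * y) - mean_x * mean_y

print(mean_x, mean_y, cov_xy)
```

The covariance vanishes even though $y$ is a deterministic function of $x$, which is the point made in the text.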

1.5.2 Mutual Information 

A more general measure of the interrelatedness of discrete variables is given by their
mutual information [20]. This is defined in terms of their joint probability distribution
$P_{xy}$, the marginal distribution $P_x$ governing the $x$ variable, and the marginal distribution
$P_y$ governing the $y$ variable. The information obtained from observing a single instance
of a discrete random variable $x$ is

$$I(x) = -\log P(x). \tag{1.95}$$

Consequently, the average information content of an entire ensemble of observations of $x$
is obtained by averaging over the entire distribution to give the entropy or uncertainty
of $x$,

$$H(x) = -\sum_{x} P(x) \log P(x). \tag{1.96}$$

Suppose now that a second discrete random variable $y$ is observed. In line with the above,
the joint entropy or uncertainty of $x$ and $y$ is

$$H(x,y) = -\sum_{x,y} P(x,y) \log P(x,y). \tag{1.97}$$

Consider now how much information we obtain about $x$ given observations of $y$. The
information obtained about $x$ given knowledge of $y$ is $-\log P(x|y)$, which when averaged
gives a measure of the remaining uncertainty in $x$ given an observation of $y$. This is the
conditional entropy of $x$ given $y$ defined as

$$H(x|y) = -\sum_{x,y} P(x,y) \log P(x|y). \tag{1.98}$$

Consequently, the average reduction in uncertainty in $x$ given observations of $y$ is the
mutual information content of the joint probability distribution describing the two discrete
random variables $x$ and $y$, and is

$$H(x;y) = H(x) - H(x|y). \tag{1.99}$$

Then, when variables $x$ and $y$ are independent, we have $P(x, y) = P(x)P(y)$ and
$P(x|y) = P(x)$, so $H(x|y) = H(x)$, ensuring their mutual information is minimized at
$H(x;y) = 0$, while their joint entropy or uncertainty is maximized at $H(x,y) = H(x) + H(y)$.
Conversely, when these variables are perfectly correlated, then $P(x,y) = P(x)P(y|x) =
P(x)\delta_{yx}$ and $P(x|y) = 1$, so $H(x|y) = 0$, ensuring their mutual information is maximized
at $H(x;y) = H(x)$, while their joint entropy or uncertainty is minimized at $H(x,y) = H(x)$ [20].

For the example considered above, we have the entropies or uncertainties in the
respective $x$ and $y$ distributions of

$$\begin{aligned}
H(x) &= \log 3 \\
H(y) &= \log 3 - \tfrac{2}{3}\log 2.
\end{aligned} \tag{1.100}$$

That is, there is less uncertainty in $y$ as there are only two possible values taken by $y$
compared to the three possible values taken by $x$. Subsequently, the respective conditional
entropies are

$$\begin{aligned}
H(x|y) &= \tfrac{2}{3}\log 2 \\
H(y|x) &= 0.
\end{aligned} \tag{1.101}$$

The difference between these conditional entropies results as knowing $x$ uniquely specifies
$y$ while knowing $y$ only partially specifies $x$. We can now calculate the mutual information
content of $x$ and $y$, which is

$$H(x;y) = H(y;x) = \log 3 - \tfrac{2}{3}\log 2. \tag{1.102}$$

Lastly, the joint entropy or uncertainty of $x$ and $y$ is

$$H(x,y) = H(y,x) = \log 3. \tag{1.103}$$
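
The values in Eqs. (1.100) to (1.103) can be reproduced numerically. The sketch below is illustrative only: it assumes natural logarithms and obtains the conditional entropies from the chain rule $H(x|y) = H(x,y) - H(y)$, which is equivalent to Eq. (1.98). The helper names `entropy` and `marginal` are hypothetical.

```python
import math
from collections import defaultdict

# Joint distribution P(x, y) for x uniform on {-1, 0, 1} and y = x^2.
joint = {(-1, 1): 1 / 3, (0, 0): 1 / 3, (1, 1): 1 / 3}

def entropy(dist):
    """Shannon entropy (natural log) of a dict mapping outcomes to probabilities."""
    return -sum(p * math.log(p) for p in dist.values() if p > 0)

def marginal(dist, axis):
    """Marginal distribution over one coordinate of the joint outcomes."""
    out = defaultdict(float)
    for outcome, p in dist.items():
        out[outcome[axis]] += p
    return dict(out)

H_xy = entropy(joint)
H_x = entropy(marginal(joint, 0))
H_y = entropy(marginal(joint, 1))
H_x_given_y = H_xy - H_y      # chain rule, equivalent to Eq. (1.98)
H_y_given_x = H_xy - H_x
I_xy = H_x - H_x_given_y      # mutual information, Eq. (1.99)

print(H_x, H_y, H_x_given_y, H_y_given_x, I_xy, H_xy)
```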
For the behavioural strategy distributions considered in this paper, we have

$$H_{x;y} = \log \frac{\left[(1-q)^{1-q} q^{q}\right]^{1-p} \left[(1-r)^{1-r} r^{r}\right]^{p}}
{\left[1 - q - p(r - q)\right]^{1 - q - p(r-q)} \left[q + p(r - q)\right]^{q + p(r-q)}}. \tag{1.104}$$

When $q = r$, indicating that $x$ and $y$ are uncorrelated, we have a mutual information
content of $H_{x;y} = 0$. Conversely, when $(q, r) = (0, 1)$ and $x$ and $y$ are perfectly correlated,
the mutual information content is

$$H_{x;y} = H(x) = -\left[(1 - p)\log(1 - p) + p \log p\right]. \tag{1.105}$$

Similarly, when $(q, r) = (1, 0)$ and $x$ and $y$ are perfectly anti-correlated, the mutual
information content is

$$H_{x;y} = H(x) = -\left[(1 - p)\log(1 - p) + p \log p\right]. \tag{1.106}$$

This duplicates the value for the perfect correlation case.
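
The limiting values quoted for Eq. (1.104) can be checked with a short sketch. This is an illustration under stated assumptions rather than the author's procedure: it computes the mutual information as $H(x) + H(y) - H(x,y)$, which equals Eq. (1.99), uses natural logarithms, and adopts the convention $0 \log 0 = 0$. The function name `behavioural_mutual_information` is hypothetical.

```python
import math

def xlogx(t):
    """t * log(t) with the convention 0 * log(0) = 0."""
    return t * math.log(t) if t > 0 else 0.0

def entropy(probs):
    return -sum(xlogx(t) for t in probs)

def behavioural_mutual_information(p, q, r):
    """Mutual information of x and y for behavioural parameters (p, q, r)."""
    joint = [(1 - p) * (1 - q), (1 - p) * q, p * (1 - r), p * r]
    marginal_x = [1 - p, p]
    marginal_y = [joint[0] + joint[2], joint[1] + joint[3]]
    return entropy(marginal_x) + entropy(marginal_y) - entropy(joint)

p = 0.3
H_x = entropy([1 - p, p])
print(behavioural_mutual_information(p, 0.4, 0.4))  # q = r
print(behavioural_mutual_information(p, 0.0, 1.0), H_x)  # perfect correlation
print(behavioural_mutual_information(p, 1.0, 0.0), H_x)  # perfect anti-correlation
```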

The case of continuous distributions is more complicated, where for instance, the
mutual information content evaluates as

$$H(x;y) = \int dx \int dy \; P(x, y) \log\left(\frac{P(x,y)}{P(x)P(y)}\right). \tag{1.107}$$

The upshot is that correlation corresponds to information. Every different probability
space that might be adopted by each player corresponds to a physical randomization
device, a "roulette", which defines certain correlations between random variables. These
correlations correspond to information, and should the correlations be ignored, then
this equates to the discarding of information. In this paper, we assume that rational
players will make use of all available information including that implicit in correlated
joint probability measure spaces.

Problem: Mutual information

Note, however, that the mutual information is not a constant when $x$ and $y$ are perfectly
correlated or anti-correlated. It is not clear how mutual information might be used, but
then again, it is not clear why correlation should have the status desired for it. What is
the connection between the functional dependencies of our deterministic examples, and
correlated variables?
Chapter 2

Isomorphisms in Strategy Spaces



2.1 Introduction

The preceding chapter has pointed out by example that there are different ways to
"contain" one probability distribution within another. Probability theory uses strong
isomorphic mappings, while game theory uses weaker isomorphic mappings which preserve fewer
properties of the original distribution within the target space. These differences arose
(perhaps) as probability space isomorphisms do not feature anywhere in the historical
definition of mixed strategy spaces. We briefly recap this historical process below.

2.1.1 Mixed strategy probability measure spaces 

Rationality, Utility: Von Neumann and Morgenstern began their formalization of game
theory by defining the economic problem as when "rational players" seek to "obtain a
maximum of utility" using "a complete set of rules of behavior in all conceivable
situations" [1]. Naturally, the result "is thus a combinatorial enumeration of enormous
complexity" [1]. Von Neumann and Morgenstern aimed to formulate a complete plan, an
analysis of every possible move or variable or outcome [1].

Moves: Each player makes moves in a game, where "A move is the occasion of a
choice between various alternatives" at each stage of the game [1].

Pure Strategies: The choices of moves combine into player strategies: "A strategy
of the player k is a function . . . which is defined for every [personal move of that player],
and whose value [determines his choice at that move]" [1]. A strategy is "a complete plan:
a plan which specifies what choices [a player] will make in every possible situation, for
every possible actual information which he may possess at that moment" [1]. Hence, for
von Neumann and Morgenstern, each different strategy for a given player is a list of all
the combinatorial play possibilities available to that player throughout the game taking
account of every different possible history and information set in the game. Each player
chooses their strategy independently of all the other players, as any dependencies and
correlations are already taken into account in the complete listing of information sets and
possibilities for every possible game that might occur. In particular, "The player k must
choose his strategy . . . without information concerning the choices of the other players,
or of the chance events (the umpire's choice). This must be so since all the information
he can at any time possess is already embodied in his strategy" [1]. The choice of a
strategy of play then becomes the sole decision to be made by the player, and this is
made independently of any other choice.

Mixed Strategies: Players can choose their pure strategies according to some
independent probability distributions, termed a mixed strategy. The probability parameters
of each distribution are subject to normalization constraints "and to no others" [1].

Nash Equilibria: Nash closely followed the von Neumann and Morgenstern
formalism [2, 3]. Nash's famous first paper commences "One may define a concept of an n-person
game in which each player has a finite set of pure strategies and in which a definite set
of payments to the n players corresponds to each n-tuple of pure strategies, one strategy
being taken for each player. . . . For mixed strategies, which are probability distributions
over the pure strategies, the pay-off functions are the expectations of the players, thus
becoming polylinear forms in the probabilities with which the various players play their
various pure strategies." [2]. In a second paper, Nash treated the mixed strategy space
as "points in a simplex whose vertices are the [pure strategies]. This simplex may be
regarded as a convex subset of a real vector space, giving us a natural process of linear
combination for the mixed strategies" [3]. Nash subsequently defined the set of all mixed
strategies for all players as "a point in a vector space, the product space of the vector
spaces containing the mixed strategies. And the set of all such [points] forms, of course,
a convex polytope, the product of the simplices representing the mixed strategies" [3].
Because all the mixed strategy probabilities are continuous, Nash was able to use fixed
point theorems to derive optimal points, referred to now as Nash equilibria.

Behavioural strategy spaces: Kuhn showed that the mixed strategy spaces could 
be replaced by the more intuitively accessible behavioural strategy space P]. The be- 
havioural strategies are merely the player's choice probabilities distributed over each 
branch of a game's decision tree. These probabilities are 'uncorrelated' or 'locally ran- 
domized' strategies wherein a local perspective decentralizes the strategy decision of each 
player into a number of local decisions [HET]. In this, the agent-normal game form, my- 
opic agents at each history set determine paths through the game tree using probability 
distributions which are uncorrelated and independent. This assumption allowed Kuhn to 
prove the equivalence of uncorrelated behavioural strategies and the uncorrelated mixed 
strategies introduced by von Neumann and Morgenstern [1] and Nash [3] in games of 
perfect recall |1]. 

Absent isomorphisms: In the historical development sketched above, there is no
room for isomorphic mappings or for any discussion of the properties of embedded
probability distributions. A game definition provides a complete list of moves and hence of
strategies and hence of mixed strategies which are independent and unconstrained (and
complete). Our alternative approach posits that a game definition can be put into a 1-1
correspondence with many alternate probability spaces, with each choice of probability
space altering the complete list of moves and of strategies and hence of mixed strategies.
In this chapter, we show that these two different approaches lead to very different
properties for mixed and behavioural strategy spaces as defined by probability theory
and game theory.



Figure 2.1: A simple decision tree where potentially independent or correlated variables x
and y take values {0, 1} with the probabilities shown. This defines the (p, q, r) behavioural
probability space.



2.2 Mixed and behavioural strategy spaces 

The different approaches of probability theory and game theory to isomorphic embeddings
affect the definitions of mixed and behavioural strategy spaces. As previously, we
will compare these spaces both with and without isomorphism constraints. Our focus
will be on a simple decision problem involving two random variables $x, y \in \{0, 1\}$ where
$y$ is potentially conditioned on $x$ as shown in the behavioural strategy decision tree of
Fig. 2.1.



2.2.1 Mixed strategy space $\mathcal{P}_M$

The mixed strategy space is denoted $\mathcal{P}_M$, and determines the choice of $x$ via a probability
distribution $\alpha$ while the respective choices of $y$ on the left branch of the decision tree $y_l$
and on the right branch $y_r$ are determined by an independent probability distribution $\beta$
according to the following table:

    (y_l, y_r) =       (0,0)     (0,1)     (1,0)     (1,1)
    probability        β_0       β_1       β_2       β_3
    (x, y) for x = 0   (0,0)     (0,0)     (0,1)     (0,1)
    (x, y) for x = 1   (1,0)     (1,1)     (1,0)     (1,1)          (2.1)

The mixed strategy simplex for each player is respectively
$S^{A} = \{(\alpha_0, \alpha_1) \in R_+^2 : \sum_j \alpha_j = 1\}$ and
$S^{B} = \{(\beta_0, \beta_1, \beta_2, \beta_3) \in R_+^4 : \sum_j \beta_j = 1\}$. The associated tangent spaces are
$T^{A} = \{z \in R^2 : \sum_j z_j = 0\}$ and $T^{B} = \{z \in R^4 : \sum_j z_j = 0\}$, equivalent to every possible
positive or negative fluctuation in the probabilities of the pure strategies of each player.
The joint probability distribution $P_{xy}(x,y)$ for $x$ and $y$ is

$$\begin{aligned}
P_{xy}(0,0) &= (1-\alpha_1)(1-\beta_2-\beta_3) \\
P_{xy}(0,1) &= (1-\alpha_1)(\beta_2+\beta_3) \\
P_{xy}(1,0) &= \alpha_1(1-\beta_1-\beta_3) \\
P_{xy}(1,1) &= \alpha_1(\beta_1+\beta_3).
\end{aligned} \tag{2.2}$$

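Here, we have used normalization constraints to eliminate $\alpha_0$ and $\beta_0$. The expectations
of the $x$ and $y$ variables are given by

$$\begin{aligned}
\langle x\rangle &= \alpha_1 \\
\langle y\rangle &= \beta_2 + \beta_3 + \alpha_1(\beta_1 - \beta_2) \\
\langle xy\rangle &= \alpha_1(\beta_1 + \beta_3),
\end{aligned} \tag{2.3}$$

while their variances are

$$\begin{aligned}
V(x) &= \alpha_1(1 - \alpha_1) \\
V(y) &= \left[\beta_2 + \beta_3 + \alpha_1(\beta_1 - \beta_2)\right]\left[1 - \beta_2 - \beta_3 - \alpha_1(\beta_1 - \beta_2)\right].
\end{aligned} \tag{2.4}$$

For completeness, we note the marginal and joint entropies are

$$\begin{aligned}
E_x &= -(1-\alpha_1)\log(1-\alpha_1) - \alpha_1\log\alpha_1 \\
E_y &= -\left[1 - \beta_2 - \beta_3 + \alpha_1(\beta_2 - \beta_1)\right]\log\left[1 - \beta_2 - \beta_3 + \alpha_1(\beta_2 - \beta_1)\right] \\
    &\quad -\left[\beta_2 + \beta_3 - \alpha_1(\beta_2 - \beta_1)\right]\log\left[\beta_2 + \beta_3 - \alpha_1(\beta_2 - \beta_1)\right] \\
E_{xy} &= -(1-\alpha_1)(1-\beta_2-\beta_3)\log\left[(1-\alpha_1)(1-\beta_2-\beta_3)\right] \\
    &\quad -(1-\alpha_1)(\beta_2+\beta_3)\log\left[(1-\alpha_1)(\beta_2+\beta_3)\right] \\
    &\quad -\alpha_1(1-\beta_1-\beta_3)\log\left[\alpha_1(1-\beta_1-\beta_3)\right] \\
    &\quad -\alpha_1(\beta_1+\beta_3)\log\left[\alpha_1(\beta_1+\beta_3)\right].
\end{aligned} \tag{2.5}$$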
Naturally, the mixed strategy probability space can model any state of correlation
between $x$ and $y$ with the correlation given by

$$\rho_{xy}(\alpha_1, \beta_1, \beta_2, \beta_3) = \frac{\sqrt{\alpha_1(1-\alpha_1)}\,(\beta_1 - \beta_2)}
{\sqrt{\left[\beta_2 + \beta_3 + \alpha_1(\beta_1 - \beta_2)\right]\left[1 - \beta_2 - \beta_3 - \alpha_1(\beta_1 - \beta_2)\right]}}. \tag{2.6}$$

Then, when $x$ and $y$ are perfectly correlated we have $\rho_{xy} = 1$ requiring the constraints
$\beta_1 = 1$ and $\beta_0 = \beta_2 = \beta_3 = 0$. When $x$ and $y$ are perfectly anti-correlated we have
$\rho_{xy} = -1$ requiring the constraints $\beta_2 = 1$ and $\beta_0 = \beta_1 = \beta_3 = 0$. Finally, when $x$ and $y$
are independent we have $\rho_{xy} = 0$ requiring the constraint $\beta_1 = \beta_2$.

2.2.2 Behavioural strategy space $\mathcal{P}_B$

The behavioural strategy probability space [1] is denoted $\mathcal{P}_B$ and is parameterized as
shown in Fig. 2.1. The behavioural strategy space for the players is
$S^{AB} = \{(p,q,r) \in R_+^3 : 0 \le p,q,r \le 1\}$ after taking account of normalization. The
associated tangent space is $T^{AB} = \{z \in R^3\}$. The probability $P_{xy}(x,y)$ that $x$ and $y$
take on their respective values is

$$\begin{aligned}
P_{xy}(0,0) &= (1-p)(1-q) \\
P_{xy}(0,1) &= (1-p)q \\
P_{xy}(1,0) &= p(1-r) \\
P_{xy}(1,1) &= pr.
\end{aligned} \tag{2.7}$$

This distribution gives the following expected values:

$$\begin{aligned}
\langle x\rangle &= p \\
\langle y\rangle &= q + p(r-q) \\
\langle xy\rangle &= pr,
\end{aligned} \tag{2.8}$$

while the variances of the $x$ and $y$ variables are

$$\begin{aligned}
V(x) &= p(1-p) \\
V(y) &= \left[q + p(r-q)\right]\left[1 - q - p(r-q)\right].
\end{aligned} \tag{2.9}$$

The marginal and joint entropies between the $x$ and $y$ variables are

$$\begin{aligned}
E_x &= -(1-p)\log(1-p) - p\log p \\
E_y &= -\left[(1-p)(1-q) + p(1-r)\right]\log\left[(1-p)(1-q) + p(1-r)\right] \\
    &\quad -\left[(1-p)q + pr\right]\log\left[(1-p)q + pr\right] \\
E_{xy} &= -(1-p)(1-q)\log\left[(1-p)(1-q)\right] - (1-p)q\log\left[(1-p)q\right] \\
    &\quad - p(1-r)\log\left[p(1-r)\right] - pr\log\left[pr\right].
\end{aligned} \tag{2.10}$$

The behavioural probability space also allows modeling any arbitrary state of correlation
between the $x$ and $y$ variables, where the correlation between $x$ and $y$ is

$$\rho_{xy} = \frac{\sqrt{p(1-p)}\,(r - q)}{\sqrt{\left[q + p(r-q)\right]\left[1 - q - p(r-q)\right]}}. \tag{2.11}$$

Then, $x$ and $y$ are perfectly correlated at $\rho_{xy}(p,0,1) = 1$, perfectly anti-correlated at
$\rho_{xy}(p,1,0) = -1$, and uncorrelated if either $p = 0$ or $p = 1$ or $q = r$ giving $\rho_{xy} = 0$. Hence,
the decision tree of Fig. 2.1 encompasses every possible state of correlation between $x$
and $y$, and thus it can be used to perform a complete analysis.
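
The relationship between the two parameterizations can be made concrete with a small sketch. It is illustrative only: comparing Eq. (2.2) with Eq. (2.7) term by term suggests the map $p = \alpha_1$, $q = \beta_2 + \beta_3$, $r = \beta_1 + \beta_3$, which is an inference made here rather than a statement from the text, and all function names are hypothetical. The sketch also checks the closed form of Eq. (2.11) against a direct computation of the correlation.

```python
import math

def mixed_joint(a1, b1, b2, b3):
    """Joint distribution of Eq. (2.2); alpha_0 and beta_0 are fixed by normalization."""
    return {(0, 0): (1 - a1) * (1 - b2 - b3), (0, 1): (1 - a1) * (b2 + b3),
            (1, 0): a1 * (1 - b1 - b3), (1, 1): a1 * (b1 + b3)}

def behavioural_joint(p, q, r):
    """Joint distribution of Eq. (2.7)."""
    return {(0, 0): (1 - p) * (1 - q), (0, 1): (1 - p) * q,
            (1, 0): p * (1 - r), (1, 1): p * r}

def correlation(joint):
    """Linear correlation computed directly from a joint distribution."""
    ex = sum(pr * x for (x, y), pr in joint.items())
    ey = sum(pr * y for (x, y), pr in joint.items())
    exy = sum(pr * x * y for (x, y), pr in joint.items())
    return (exy - ex * ey) / math.sqrt(ex * (1 - ex) * ey * (1 - ey))

def correlation_closed_form(p, q, r):
    """Closed form of Eq. (2.11)."""
    mean_y = q + p * (r - q)
    return math.sqrt(p * (1 - p)) * (r - q) / math.sqrt(mean_y * (1 - mean_y))

# Assumed map from mixed to behavioural parameters (inferred from Eqs. 2.2 and 2.7).
a1, b1, b2, b3 = 0.3, 0.4, 0.1, 0.2
mapped = behavioural_joint(a1, b2 + b3, b1 + b3)
print(mixed_joint(a1, b1, b2, b3), mapped)
print(correlation(behavioural_joint(0.3, 0.0, 1.0)),
      correlation(behavioural_joint(0.3, 1.0, 0.0)),
      correlation(behavioural_joint(0.3, 0.4, 0.4)))
```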

2.2.3 Isomorphic Mixed and Behavioural Spaces 

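The mixed $\mathcal{P}_M$ and behavioural $\mathcal{P}_B$ strategy spaces contain embedded probability spaces
where $x$ and $y$ are respectively perfectly correlated, independent, or partially correlated.
As previously, we will now perform a comparison of probability spaces, both with and
without isomorphic constraints, for various correlation states between the $x$ and $y$
variables. That is, we will compare the mixed strategy space $\mathcal{P}_M$ and behavioural strategy
space $\mathcal{P}_B$ with isomorphically constrained mixed and behavioural strategy spaces as
indicated using the following notation.

The case of perfectly correlated $x$ and $y$ variables is modeled by the spaces

$$\begin{array}{ll}
\lim_{\beta_1 \to 1} \mathcal{P}_M & \text{mixed} \\
\mathcal{P}_M\vert_{\beta_1 = 1} & \text{constrained mixed} \\
\lim_{(q,r) \to (0,1)} \mathcal{P}_B & \text{behavioural} \\
\mathcal{P}_B\vert_{(q,r) = (0,1)} & \text{constrained behavioural}
\end{array} \tag{2.12}$$

In these spaces we expect all of the following to hold:

• $\nabla\left[P_{xy}(0,0) + P_{xy}(1,1)\right] = 0$,

• $\nabla\left[P_{xy}(0,1) + P_{xy}(1,0)\right] = 0$,

• $\nabla\left[P_{x|y}(0|0)\right] = 0$,

• $\nabla\left[P_{x|y}(0|1)\right] = 0$,

• $\nabla\left[\langle x\rangle - \langle y\rangle\right] = 0$,

• $\nabla\left[\langle x\rangle - \langle xy\rangle\right] = 0$,

• $\nabla\left[\langle y\rangle - \langle xy\rangle\right] = 0$,

• $\nabla\left[V(x-y)\right] = \nabla\left[V(x) + V(y) - 2\,\mathrm{cov}(x,y)\right] = 0$,

• $\nabla\left[E_{xy} - E_x\right] = 0$.

Perfectly correlated case, $\rho_{xy} = 1$:

Quantity | $\mathcal{P}_M$ | $\mathcal{P}_B$ | $\mathcal{P}_M\vert_{\beta_1=1}$ | $\mathcal{P}_B\vert_{(q,r)=(0,1)}$
Parameters | $\alpha_1, \beta_1, \beta_2, \beta_3$ | $p, q, r$ | $\alpha_1$ | $p$
Dimensions | 4 | 3 | 1 | 1
$\nabla$ operator | $\hat{\alpha}_1\frac{\partial}{\partial\alpha_1} + \hat{\beta}_1\frac{\partial}{\partial\beta_1} + \hat{\beta}_2\frac{\partial}{\partial\beta_2} + \hat{\beta}_3\frac{\partial}{\partial\beta_3}$ | $\hat{p}\frac{\partial}{\partial p} + \hat{q}\frac{\partial}{\partial q} + \hat{r}\frac{\partial}{\partial r}$ | $\hat{\alpha}_1\frac{\partial}{\partial\alpha_1}$ | $\hat{p}\frac{\partial}{\partial p}$
Gradient | $\lim_{\beta_1\to 1}\nabla(\cdot)$ | $\lim_{(q,r)\to(0,1)}\nabla(\cdot)$ | $\nabla$ | $\nabla$
Probability conservation
$\nabla[P_{xy}(0,0) + P_{xy}(1,1)]$ | $\alpha_1\hat{\beta}_1 - (1-\alpha_1)\hat{\beta}_2 + (2\alpha_1 - 1)\hat{\beta}_3$ | $-(1-p)\hat{q} + p\hat{r}$ | 0 | 0
$\nabla[P_{xy}(0,1) + P_{xy}(1,0)]$ | $-\alpha_1\hat{\beta}_1 + (1-\alpha_1)\hat{\beta}_2 - (2\alpha_1 - 1)\hat{\beta}_3$ | $(1-p)\hat{q} - p\hat{r}$ | 0 | 0
Conditionals
$\nabla P_{x|y}(0|0)$ | $\frac{\alpha_1}{1-\alpha_1}(\hat{\beta}_1 + \hat{\beta}_3)$ | $\frac{p}{1-p}\hat{r}$ | 0 | 0
$\nabla P_{x|y}(0|1)$ | $\frac{1-\alpha_1}{\alpha_1}(\hat{\beta}_2 + \hat{\beta}_3)$ | $\frac{1-p}{p}\hat{q}$ | 0 | 0
Expectations
$\nabla\langle x\rangle$ | $\hat{\alpha}_1$ | $\hat{p}$ | $\hat{\alpha}_1$ | $\hat{p}$
$\nabla\langle y\rangle$ | $\hat{\alpha}_1 + \alpha_1\hat{\beta}_1 + (1-\alpha_1)\hat{\beta}_2 + \hat{\beta}_3$ | $\hat{p} + (1-p)\hat{q} + p\hat{r}$ | $\hat{\alpha}_1$ | $\hat{p}$
$\nabla\langle xy\rangle$ | $\hat{\alpha}_1 + \alpha_1\hat{\beta}_1 + \alpha_1\hat{\beta}_3$ | $\hat{p} + p\hat{r}$ | $\hat{\alpha}_1$ | $\hat{p}$
Variance
$\nabla[V(x) + V(y) - 2\,\mathrm{cov}(x,y)]$ | $-\alpha_1\hat{\beta}_1 + (1-\alpha_1)\hat{\beta}_2 + (1 - 2\alpha_1)\hat{\beta}_3$ | $(1-p)\hat{q} - p\hat{r}$ | 0 | 0
Entropy
$\nabla[E_{xy} - E_x]$ | $\neq 0$ | $\neq 0$ | 0 | 0
Correlation
$\nabla\rho_{xy}$ | $\neq 0$ | $\neq 0$ | 0 | 0

Independent case, $\rho_{xy} = 0$:

Quantity | $\mathcal{P}_M$ | $\mathcal{P}_B$ | $\mathcal{P}_M\vert_{\beta_1=\beta_2}$ | $\mathcal{P}_B\vert_{r=q}$
Parameters | $\alpha_1, \beta_1, \beta_2, \beta_3$ | $p, q, r$ | $\alpha_1, \beta = \beta_1 + \beta_3$ | $p, q$
Dimensions | 4 | 3 | 2 | 2
$\nabla$ operator | $\hat{\alpha}_1\frac{\partial}{\partial\alpha_1} + \hat{\beta}_1\frac{\partial}{\partial\beta_1} + \hat{\beta}_2\frac{\partial}{\partial\beta_2} + \hat{\beta}_3\frac{\partial}{\partial\beta_3}$ | $\hat{p}\frac{\partial}{\partial p} + \hat{q}\frac{\partial}{\partial q} + \hat{r}\frac{\partial}{\partial r}$ | $\hat{\alpha}_1\frac{\partial}{\partial\alpha_1} + \hat{\beta}\frac{\partial}{\partial\beta}$ | $\hat{p}\frac{\partial}{\partial p} + \hat{q}\frac{\partial}{\partial q}$
Gradient | $\lim_{\beta_2\to\beta_1}\nabla(\cdot)$ | $\lim_{r\to q}\nabla(\cdot)$ | $\nabla$ | $\nabla$
Probability
$\nabla[P_{xy}(0,0) - P_x(0)P_y(0)]$ | $\alpha_1(1-\alpha_1)(\hat{\beta}_1 - \hat{\beta}_2)$ | $p(1-p)(\hat{r} - \hat{q})$ | 0 | 0
$\nabla[P_{xy}(0,1) - P_x(0)P_y(1)]$ | $\alpha_1(1-\alpha_1)(\hat{\beta}_2 - \hat{\beta}_1)$ | $p(1-p)(\hat{q} - \hat{r})$ | 0 | 0
$\nabla[P_{xy}(1,0) - P_x(1)P_y(0)]$ | $\alpha_1(1-\alpha_1)(\hat{\beta}_2 - \hat{\beta}_1)$ | $p(1-p)(\hat{q} - \hat{r})$ | 0 | 0
$\nabla[P_{xy}(1,1) - P_x(1)P_y(1)]$ | $\alpha_1(1-\alpha_1)(\hat{\beta}_1 - \hat{\beta}_2)$ | $p(1-p)(\hat{r} - \hat{q})$ | 0 | 0
Conditionals
$\nabla[P_{x|y}(0|0) - P_x(0)]$ | $\frac{\alpha_1(1-\alpha_1)}{1-\beta_2-\beta_3}(\hat{\beta}_1 - \hat{\beta}_2)$ | $\frac{p(1-p)}{1-q}(\hat{r} - \hat{q})$ | 0 | 0
$\nabla[P_{x|y}(0|1) - P_x(0)]$ | $\frac{\alpha_1(1-\alpha_1)}{\beta_2+\beta_3}(\hat{\beta}_2 - \hat{\beta}_1)$ | $\frac{p(1-p)}{q}(\hat{q} - \hat{r})$ | 0 | 0
Expectation
$\nabla[\langle xy\rangle - \langle x\rangle\langle y\rangle]$ | $\alpha_1(1-\alpha_1)(\hat{\beta}_1 - \hat{\beta}_2)$ | $p(1-p)(\hat{r} - \hat{q})$ | 0 | 0
Entropy
$\nabla[E_{xy} - E_x - E_y]$ | $\neq 0$ | $\neq 0$ | 0 | 0
Correlation
$\nabla\rho_{xy}$ | $\neq 0$ | $\neq 0$ | 0 | 0

Table 2.1: A comparison of calculated results for mixed $\mathcal{P}_M$ and behavioural $\mathcal{P}_B$ strategy
spaces with those same spaces when subject to isomorphic constraints. We examine points
where respectively the $x$ and $y$ variables are first perfectly correlated with $\rho_{xy} = 1$ and
then independent with $\rho_{xy} = 0$. In the unconstrained spaces, all quantities
are evaluated at points satisfying $\lim_{\beta_1\to 1}$ or $\lim_{(q,r)\to(0,1)}$ when $\rho_{xy} = 1$, and at points
satisfying $\lim_{\beta_2\to\beta_1}$ or $\lim_{r\to q}$ when $\rho_{xy} = 0$. The isomorphically constrained spaces are
respectively indicated by $\mathcal{P}_M\vert_{\beta_1=1}$ and $\mathcal{P}_B\vert_{(q,r)=(0,1)}$ for the perfectly correlated case, and
$\mathcal{P}_M\vert_{\beta_1=\beta_2}$ and $\mathcal{P}_B\vert_{r=q}$ when the variables are independent. Game theory and probability
theory assign different dimensionality and tangent spaces to these cases. Many calculated
results differ between these spaces.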

Alternately, when $x$ and $y$ are independent, the relevant spaces are

$$\begin{array}{ll}
\lim_{\beta_1 \to \beta_2} \mathcal{P}_M & \text{mixed} \\
\mathcal{P}_M\vert_{\beta_1 = \beta_2} & \text{constrained mixed} \\
\lim_{r \to q} \mathcal{P}_B & \text{behavioural} \\
\mathcal{P}_B\vert_{r = q} & \text{constrained behavioural}
\end{array} \tag{2.13}$$

In all these spaces, the probability distributions satisfy

• $\nabla\left[P_{xy} - P_x P_y\right] = 0$,

• $\nabla\left[P_{x|y} - P_x\right] = 0$,

• $\nabla\left[\langle xy\rangle - \langle x\rangle\langle y\rangle\right] = 0$,

• $\nabla\left[E_{xy} - E_x - E_y\right] = 0$.

Table 2.1 records whether each of the expected relations is satisfied for each of the
mixed and behavioural spaces when they are either unconstrained, or isomorphically
constrained. As might be expected, the results indicate that the weak isomorphisms
used to construct the mixed and behavioural spaces of game theory are not able to
reproduce necessarily true results from probability theory. Hence, the rational player
of game theory is unable to reliably reproduce results from probability theory. These
differences between game theory and probability theory need to be resolved.

2.3 Discussion 

The question posed in this chapter is whether a physical situation involving variables
$(x,y)$ defines a set of moves $(x,y) \in \{(0, 0), (0, 1), (1, 0), (1, 1)\}$ which then defines a
mixed strategy space of three dimensions, or whether the variables $(x, y)$ can be modeled
by multiple distinct probability distributions (perfectly correlated, independent,
anti-correlated, etc.) each of which defines a set of possible moves and corresponding mixed
strategy space. These two different approaches can each be modeled using a single mixed
strategy space with or without isomorphism constraints. In this case, the question is
whether the simple physical decision or game involving the variables $(x, y)$ is best
modeled by a single probability space which contains all others without using isomorphic
constraints and alters the properties of those embedded spaces to reflect decision
uncertainty, or by a single probability space using isomorphic constraints to perfectly preserve
the properties of all embedded spaces.



Chapter 3 

A simple decision tree optimization 



3.1 Optimizing simple decision trees 

We now turn to consider how the differences between probability theory and game theory
influence decision tree optimization. We consider the usual two potentially correlated
random variables depicted in Fig. 2.1 and will use both the unconstrained behavioural
probability space $\mathcal{P}_B$ and the isomorphically constrained behavioural spaces
$\mathcal{P}_B\vert_{\rho_{xy}=\rho}$ for every value of the correlation state $\rho \in [-1, 1]$. Our goal is to present
an optimization problem in which a rational player following the rules of game theory cannot
achieve the payoff outcomes of a player following the rules of probability theory. We suppose
that a player gains a payoff by advising a referee of the parameters of the decision tree
probability space $(p, q, r)$ to optimize a given nonlinear random function. The referee uses
these parameters to determine the value of the function and provides a payoff equivalent
to this value. (If desired, the referee could estimate the probability parameters by using
indicator functions and observing an ensemble average of decision tree outcomes.)

3.1.1 Non-polylinear payoff functions 

There are many possible random functions which we could use, and some are listed in
Table 2.1. We could choose any relations from this table of the form $f = 0$ provided
probability theory shows $\nabla f = 0$ and game theory has $\nabla f \neq 0$. When this is so, the
function $\nabla f$ acts effectively as a discrepancy vector. We focus on the squared length
of the discrepancy vector and examine functions of the form $F = 1 - |\nabla f|^2$.
Immediately, probability theory will optimize this function at the point $F = 1$ while
game theory will locate an optimum at $F < 1$. In particular, we choose

$$f = P_{xy}(0,0) + P_{xy}(1,1) \tag{3.1}$$

so

$$\begin{aligned}
F &= 1 - \left|\nabla\left[P_{xy}(0,0) + P_{xy}(1,1)\right]\right|^2 \\
  &= 1 - \left|\nabla\left[1 - q + p(q + r - 1)\right]\right|^2.
\end{aligned} \tag{3.2}$$

In the unconstrained behavioural space $\mathcal{P}_B$, a rational player will evaluate this as

$$F = 1 - (1 - q - r)^2 - (1 - p)^2 - p^2. \tag{3.3}$$

In turn, this will be maximized at points $p = \frac{1}{2}$ and $q + r = 1$ to give a maximum payoff
of $F_{\max} = \frac{1}{2}$.
A contrasting result is obtained using the isomorphism constraints of probability
theory where our player faces the optimization problem

$$\begin{aligned}
\max F &= 1 - \left|\nabla\left[1 - q + p(q + r - 1)\right]\right|^2 \\
\text{subject to } & \rho_{xy} = \rho, \quad \forall \rho \in [-1, 1].
\end{aligned} \tag{3.4}$$

Our player might commence by adopting the constraint $\rho_{xy} = 1$ implemented by
$(q, r) = (0, 1)$ to give

$$\begin{aligned}
\max F &= 1 - \left|\nabla\left[1 - q + p(q + r - 1)\right]\right|^2_{(q,r)=(0,1)} \\
       &= 1.
\end{aligned} \tag{3.5}$$

This analysis leads to an optimum point at arbitrary $p$ and $(q, r) = (0, 1)$ and a maximum
payoff of $F_{\max} = 1$. Self-evidently, the player would cease their optimization analysis at
this point as the achieved maximum cannot be improved.
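
The contrast between Eq. (3.3) and Eq. (3.5) can be illustrated numerically. The sketch below is an illustration under stated assumptions, not the author's procedure: the unconstrained payoff uses the analytic gradient of $f = 1 - q + p(q + r - 1)$ in the three-dimensional $(p, q, r)$ space and a coarse grid search, while the constrained payoff fixes $(q, r) = (0, 1)$ and differentiates with respect to the single remaining parameter $p$ by a central finite difference. The function names are hypothetical.

```python
def unconstrained_payoff(p, q, r):
    """F = 1 - |grad f|^2 with the gradient taken over (p, q, r), as in Eq. (3.3)."""
    grad = (q + r - 1.0, p - 1.0, p)   # (df/dp, df/dq, df/dr)
    return 1.0 - sum(g * g for g in grad)

def constrained_payoff(p, h=1e-6):
    """F = 1 - (df/dp)^2 in the one-dimensional space with (q, r) = (0, 1)."""
    q, r = 0.0, 1.0
    f = lambda s: (1.0 - s) * (1.0 - q) + s * r   # P_xy(0,0) + P_xy(1,1)
    df_dp = (f(p + h) - f(p - h)) / (2.0 * h)
    return 1.0 - df_dp ** 2

grid = [i / 20 for i in range(21)]
best_unconstrained = max(unconstrained_payoff(p, q, r)
                         for p in grid for q in grid for r in grid)

print(best_unconstrained)        # best value found on the grid in the enlarged space
print(constrained_payoff(0.3))   # value in the constrained space at an arbitrary p
```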



Figure 3.1: A non-strategic decision tree over two stages where a variable $x \in \{0, 1\}$ is
chosen in the first stage to condition the choice of a second variable $y \in \{0, 1\}$ in the
second stage. The attained payoffs $\Pi$ are as shown.



3.1.2 Polylinear payoff functions

Of course, there are many random functions defined over decision trees which produce 
identical results when using or not using isomorphic constraints. We now briefly illustrate 






this using polylinear expected payoff functions, and consider optimizing the function

$$\begin{aligned}
\max \langle\Pi\rangle &= 2\langle x\rangle + 3\langle y\rangle - 4\langle xy\rangle \\
\text{subject to } & \rho_{xy} = \rho, \quad \forall \rho \in [-1, 1]
\end{aligned} \tag{3.6}$$

over the decision tree of Fig. 3.1. Of course, simple inspection will locate the optimum
at $(\langle x\rangle, \langle y\rangle) = (0, 1)$ giving an expected payoff of $\langle\Pi\rangle = 3$. However, we step through the
process for later generalization to strategic games.




Figure 3.2: The decision tree resulting when the variables x and y are perfectly correlated.



There are an infinite number of correlation constraints to be examined, but several
are straightforward. As shown in Fig. 3.2, when the variables are perfectly correlated at
$\rho_{xy} = 1$ via the constraint $(q,r) = (0, 1)$, we have $\langle x\rangle = \langle y\rangle = \langle xy\rangle$ giving

$$\langle\Pi\rangle = \langle x\rangle. \tag{3.7}$$

This is optimized by setting $\langle x\rangle = 1$ giving an expected payoff of $\langle\Pi\rangle = 1$.












Figure 3.3: The decision tree resulting when the variables x and y are independent.
Fig. 3.3 sets $\rho_{xy} = 0$ so the $x$ and $y$ variables are independent by using the constraint
$r = q$. The expectations are now separable giving $\langle xy\rangle = \langle x\rangle\langle y\rangle$ and

$$\langle\Pi\rangle = 2\langle x\rangle + 3\langle y\rangle - 4\langle x\rangle\langle y\rangle. \tag{3.8}$$

As the $\langle x\rangle$ and $\langle y\rangle$ variables are independent, a check of internal stationary points and
the boundary leads to an optimal point at $(\langle x\rangle, \langle y\rangle) = (0, 1)$ and an expected payoff of
$\langle\Pi\rangle = 3$.




Figure 3.4: The decision tree resulting when the variables x and y are perfectly
anti-correlated.

We lastly consider the case where the variables are perfectly anti-correlated. As shown
in Fig. 3.4, when the variables are perfectly anti-correlated at $\rho_{xy} = -1$ via the constraint
$(q, r) = (1, 0)$, we have $\langle y\rangle = 1 - \langle x\rangle$ and $\langle xy\rangle = 0$ giving

$$\langle\Pi\rangle = 3 - \langle x\rangle. \tag{3.9}$$

This is optimized by setting $\langle x\rangle = 0$ giving an expected payoff of $\langle\Pi\rangle = 3$.
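
The three special cases above can be checked with a brief sketch. It is illustrative only: it writes the expected payoff of Eq. (3.6) in the behavioural parameters using $\langle x\rangle = p$, $\langle y\rangle = q + p(r - q)$ and $\langle xy\rangle = pr$ from Eq. (2.8), and replaces the analytic argument with a simple grid search. The function name `expected_payoff` is hypothetical.

```python
def expected_payoff(p, q, r):
    """<Pi> = 2<x> + 3<y> - 4<xy> in behavioural parameters (p, q, r)."""
    mean_x = p
    mean_y = q + p * (r - q)
    mean_xy = p * r
    return 2.0 * mean_x + 3.0 * mean_y - 4.0 * mean_xy

grid = [i / 100 for i in range(101)]

# Perfect correlation: (q, r) = (0, 1), only p varies.
best_correlated = max(expected_payoff(p, 0.0, 1.0) for p in grid)
# Independence: r = q, both p and q vary.
best_independent = max(expected_payoff(p, q, q) for p in grid for q in grid)
# Perfect anti-correlation: (q, r) = (1, 0), only p varies.
best_anticorrelated = max(expected_payoff(p, 1.0, 0.0) for p in grid)

print(best_correlated, best_independent, best_anticorrelated)
```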

More general correlation states require use of, for instance, standard Lagrangian optimization procedures.

However, we here adopt a numerical optimization approach by first using the correlation constraint to write the $r$ variable as a function of $p$, $q$ and the correlation constant $\rho$, giving a function $r = r_{\pm}(p, q, \rho)$. In particular, when the correlation (Eq. 2.11) between $x$ and $y$ is $\rho_{xy} = \rho$, and as long as both $p \neq 0$ and $p \neq 1$, then the correlation constraint defines two surfaces in the $(p, q, r)$ simplex at height

$$r_{\pm}(p, q, \rho) = \frac{\rho^2 - 2q(1-p)(\rho^2 - 1) \pm \rho\sqrt{\rho^2 + 4q(1-q)\frac{1-p}{p}}}{2\left[1 + p(\rho^2 - 1)\right]}. \tag{3.10}$$
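The closed form of Eq. (3.10) can be checked numerically. The following Python sketch is illustrative only: it assumes the behavioural parametrization $P(x=1) = p$, $P(y=1|x=0) = q$, $P(y=1|x=1) = r$, and that Eq. (3.10) has the form coded below. The names `r_plus` and `correlation` are hypothetical helpers, not anything defined in the text. The sketch evaluates the upper branch $r_+$ and substitutes it back into the correlation to see whether the requested value is recovered.

```python
import math


def r_plus(p, q, rho):
    """Upper branch r_+(p, q, rho) of Eq. (3.10); requires 0 < p < 1."""
    root = math.sqrt(rho**2 + 4.0 * q * (1.0 - q) * (1.0 - p) / p)
    numerator = rho**2 - 2.0 * q * (1.0 - p) * (rho**2 - 1.0) + rho * root
    return numerator / (2.0 * (1.0 + p * (rho**2 - 1.0)))


def correlation(p, q, r):
    """Correlation of binary x and y under the assumed parametrization.

    Uses <x> = p, <y> = q + p (r - q) and <xy> = p r; undefined when
    either variance vanishes.
    """
    mean_y = q + p * (r - q)
    return (math.sqrt(p * (1.0 - p)) * (r - q)
            / math.sqrt(mean_y * (1.0 - mean_y)))


if __name__ == "__main__":
    # Each line shows the target rho, the height r_+, and the recovered rho.
    for rho, q in [(0.5, 0.2), (-0.5, 0.8), (0.0, 0.4)]:
        r = r_plus(0.5, q, rho)
        print(rho, round(r, 4), round(correlation(0.5, q, r), 4))
```

Points where $r_+$ falls outside $[0, 1]$ are not valid probabilities, so the check is only meaningful inside the permissible region.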

The function $r_{+}(p, q, \rho)$ will give the correlation surfaces we require within the simplex. That is, when $\rho = 0$ we have $r_{+}(p, q, 0) = q$ as required. Similarly, when $\rho = 1$ we have $r_{+}(p, q, 1) \geq 1$ across the entire $(p, q)$ plane with the equality $r_{+}(p, q, 1) = 1$ only where $q = 0$ or $q = 1$. We require $\rho = 1$ at $(q, r) = (0, 1)$. Finally, when $\rho = -1$ and $x$ and $y$ are perfectly anti-correlated, we have $r_{+}(p, q, -1) \leq 0$ across the entire $(p, q)$ plane with the equality $r_{+}(p, q, -1) = 0$ only where $q = 0$ or $q = 1$. We require $\rho = -1$ at $(q, r) = (1, 0)$. The strict requirement that $0 \leq r_{+}(p, q, \rho) \leq 1$ establishes permissible regions on the $(p, q)$ plane. For $0 < \rho < 1$, the permissible region is bounded by the $q = 0$ line and the line

$$q(p, \rho) = \frac{p}{p + \frac{\rho^2}{1 - \rho^2}}. \tag{3.11}$$



Similarly, for $-1 < \rho < 0$, the $(p, q)$ region is bounded by the $q = 1$ line and the line

$$q(p, \rho) = \frac{1}{1 + p\,\frac{1 - \rho^2}{\rho^2}}. \tag{3.12}$$



The problem is then solved using a typical Mathematica command line of [22]

$$\mathrm{NMaximize}\big[\{\mathrm{inRange}[r_{+}(p, q, \rho)] \times [2p + 3q - 3pq - p\,r_{+}(p, q, \rho)],\; 0 \leq p \leq 1 \;\&\&\; 0 \leq q \leq 1\},\; \{p, q\}\big]. \tag{3.13}$$



Here, a suitably defined "inRange" function determines whether $r_{+}$ is taking permissible values between zero and unity, allowing the payoff function to be examined over the entire $(p, q)$ plane. The resulting optimal expected payoffs are as follows:

$$\begin{array}{c|c|c}
\rho & (p, q, r) & \langle \Pi \rangle \\ \hline
+1 & (1., 0., 1.) & 1. \\
+0.75 & (0.8138, 0.3876, 1.) & 1.03032 \\
+0.5 & (0.4831, 0.5917, 1.) & 1.40068 \\
+0.25 & (0.2590, 0.7953, 1.) & 2.02693 \\
0 & (0., 1., 1.) & 3. \\
-0.25 & (0., 1., 0.9378) & 3. \\
-0.5 & (0., 1., 0.7506) & 3. \\
-0.75 & (0., 1., 0.4386) & 3. \\
-1 & (0., 1., 0.) & 3.
\end{array} \tag{3.14}$$
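For readers without Mathematica, the same search can be approximated with a plain grid scan. The sketch below is not the command of Eq. (3.13) but a Python stand-in under stated assumptions: `r_plus` encodes the upper branch of Eq. (3.10), the test `0 <= r <= 1` plays the role of "inRange", and only interior grid points with $0 < p < 1$ and $0 < q < 1$ are visited so that degenerate points, where the correlation is undefined, are skipped. The function names and the grid size `n` are hypothetical choices.

```python
import math


def r_plus(p, q, rho):
    """Upper branch r_+ of Eq. (3.10); valid for 0 < p < 1."""
    root = math.sqrt(rho**2 + 4.0 * q * (1.0 - q) * (1.0 - p) / p)
    numerator = rho**2 - 2.0 * q * (1.0 - p) * (rho**2 - 1.0) + rho * root
    return numerator / (2.0 * (1.0 + p * (rho**2 - 1.0)))


def payoff(p, q, r):
    """Expected payoff 2p + 3q - 3pq - pr appearing in Eq. (3.13)."""
    return 2.0 * p + 3.0 * q - 3.0 * p * q - p * r


def grid_maximize(rho, n=400):
    """Scan interior grid points (i/n, j/n); keep the best permissible one."""
    best_value, best_point = -math.inf, None
    for i in range(1, n):
        p = i / n
        for j in range(1, n):
            q = j / n
            r = r_plus(p, q, rho)
            if 0.0 <= r <= 1.0:  # stands in for "inRange"
                value = payoff(p, q, r)
                if value > best_value:
                    best_value, best_point = value, (p, q, r)
    return best_value, best_point


if __name__ == "__main__":
    for rho in (0.75, 0.5, 0.25):
        value, point = grid_maximize(rho)
        print(rho, point, round(value, 4))
```

Because the optimum sits on the boundary of the permissible region, a finite grid can only approach the tabulated values from below; refining `n` should tighten the agreement.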



Some care must be taken to ensure convergence of the solution. This analysis makes it evident that the player can maximize expected payoffs by choosing a correlation constraint where $x$ and $y$ are independent (say), allowing the setting $(p, q, r) = (0, 1, 1)$ to gain a payoff of $\langle \Pi \rangle = 3$. Other choices would also have been possible.

We now turn to applying isomorphism constraints to the strategic analysis of game 
theory. 
Chapter 4

A simple two-player-two-stage optimization



4.1 Optimizing a multistage game tree 

In this section, we show that the use of isomorphic constraints can alter the outcomes of strategic games even when expected payoff functions are being used. We will consider in turn the mixed strategy space $\mathcal{P}_M$ (Eq. 2.2), the behavioural strategy space $\mathcal{P}_B$ (Eq. 2.7), and the isomorphically constrained behavioural spaces $\mathcal{P}_B|_{\rho_{xy} = \rho}$ for every value of the correlation state $\rho \in [-1, 1]$.

We consider a strategic interaction between two players over multiple stages as depicted in Fig. 4.1. Here, two players denoted X and Y seek to optimize their respective payoffs

$$\begin{aligned}
X &: \max_x \Pi^X(x, y) = 3 - 2x - y + 4xy \\
Y &: \max_y \Pi^Y(x, y) = 1 + 3x + y - 2xy.
\end{aligned} \tag{4.1}$$

Again, we assume a domain $x, y \in \{0, 1\}$ and that player X chooses the value of $x$ and advises this to Y before Y determines the value of $y$. Players will either consider the payoff functions above or their expectations

$$\begin{aligned}
X &: \max \langle \Pi^X \rangle = 3 - 2\langle x \rangle - \langle y \rangle + 4\langle xy \rangle \\
Y &: \max \langle \Pi^Y \rangle = 1 + 3\langle x \rangle + \langle y \rangle - 2\langle xy \rangle.
\end{aligned} \tag{4.2}$$
Figure 4.1: Two players, X and Y, conduct a two-stage sequential game where X chooses the first variable $x \in \{0, 1\}$ and Y chooses the second variable $y \in \{0, 1\}$ conditioned on $x$. The payoffs for players are $\Pi^X$ and $\Pi^Y$.

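4.1.1 Unconstrained mixed space $\mathcal{P}_M$

For the unconstrained mixed strategy space $\mathcal{P}_M$, the expected payoffs for each player are

$$\begin{array}{c|cccc}
(y_l, y_r) = & (0,0) & (0,1) & (1,0) & (1,1) \\
(\langle \Pi^X \rangle, \langle \Pi^Y \rangle) & \beta_0 & \beta_1 & \beta_2 & \beta_3 \\ \hline
\alpha_0 & (3,1) & (3,1) & (2,2) & (2,2) \\
\alpha_1 & (1,4) & (4,3) & (1,4) & (4,3).
\end{array} \tag{4.3}$$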



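Using this table, the expected payoff functions take the form

$$\begin{aligned}
\langle \Pi^X \rangle &= 3 - \beta_2 - \beta_3 + \alpha_1(-2 + 3\beta_1 + \beta_2 + 4\beta_3) \\
\langle \Pi^Y \rangle &= 1 + \beta_2 + \beta_3 + \alpha_1(3 - \beta_1 - \beta_2 - 2\beta_3)
\end{aligned} \tag{4.4}$$

while the unconstrained gradients evaluate as

$$\begin{aligned}
\nabla \langle \Pi^X \rangle &= (-2 + 3\beta_1 + \beta_2 + 4\beta_3)\,\hat{\alpha}_1 + 3\alpha_1\,\hat{\beta}_1 + (\alpha_1 - 1)\,\hat{\beta}_2 + (4\alpha_1 - 1)\,\hat{\beta}_3 \\
\nabla \langle \Pi^Y \rangle &= (3 - \beta_1 - \beta_2 - 2\beta_3)\,\hat{\alpha}_1 - \alpha_1\,\hat{\beta}_1 + (1 - \alpha_1)\,\hat{\beta}_2 + (1 - 2\alpha_1)\,\hat{\beta}_3.
\end{aligned} \tag{4.5}$$

The expected payoff can then be optimized either by comparing returns in the payoff table for each mixed strategy combination, or by the equivalent strategy of comparing the simultaneous rates of change of the payoff functions with the probability parameters. (To illustrate the second approach, the rate of change of $\langle \Pi^Y \rangle$ with $\beta_1$ is equal to $-\alpha_1$, which is almost always negative, indicating that payoffs are maximized by setting $\beta_1 = 0$.) Either approach then locates the optimal mixed strategy of $(\alpha_1, \beta_1, \beta_2, \beta_3) = (0, 0, 1, 0)$ leading to expected payoffs of $(\langle \Pi^X \rangle, \langle \Pi^Y \rangle) = (2, 2)$.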
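As a cross-check on Eq. (4.4), the expected payoffs can be recomputed by summing over pure strategy pairs. This is a minimal Python sketch, assuming that $\beta_0, \ldots, \beta_3$ weight player Y's pure strategies (reply to $x = 0$, reply to $x = 1$) in the order $(0,0), (0,1), (1,0), (1,1)$; the names `PAYOFFS`, `Y_PURE`, `expected_payoffs` and `closed_form` are hypothetical.

```python
# Payoffs (Pi^X, Pi^Y) for each realized pair (x, y), from Eq. (4.1).
PAYOFFS = {(0, 0): (3, 1), (0, 1): (2, 2), (1, 0): (1, 4), (1, 1): (4, 3)}

# Player Y's pure strategies as (reply to x = 0, reply to x = 1), assumed to
# carry the probabilities beta_0, beta_1, beta_2, beta_3 in this order.
Y_PURE = [(0, 0), (0, 1), (1, 0), (1, 1)]


def expected_payoffs(alpha1, beta1, beta2, beta3):
    """Sum payoffs over all pure strategy pairs weighted by probability."""
    alphas = [1.0 - alpha1, alpha1]
    betas = [1.0 - beta1 - beta2 - beta3, beta1, beta2, beta3]
    ex = ey = 0.0
    for x, a in enumerate(alphas):
        for replies, b in zip(Y_PURE, betas):
            px, py = PAYOFFS[(x, replies[x])]
            ex += a * b * px
            ey += a * b * py
    return ex, ey


def closed_form(alpha1, beta1, beta2, beta3):
    """The expected payoff functions as written in Eq. (4.4)."""
    ex = 3 - beta2 - beta3 + alpha1 * (-2 + 3 * beta1 + beta2 + 4 * beta3)
    ey = 1 + beta2 + beta3 + alpha1 * (3 - beta1 - beta2 - 2 * beta3)
    return ex, ey


if __name__ == "__main__":
    print(expected_payoffs(0.0, 0.0, 1.0, 0.0))
    print(expected_payoffs(0.3, 0.1, 0.2, 0.4), closed_form(0.3, 0.1, 0.2, 0.4))
```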
4.1.2 Unconstrained behavioural space $\mathcal{P}_B$

The unconstrained behavioural strategy space $\mathcal{P}_B$ is pictured in Fig. 2.1. The unconstrained optimization problem faced by each player is

$$\begin{aligned}
X &: \max \langle \Pi^X \rangle = 3 - 2p - q + pq + 3pr \\
Y &: \max \langle \Pi^Y \rangle = 1 + 3p + q - pq - pr.
\end{aligned} \tag{4.6}$$

The unconstrained gradients of the expected payoffs evaluate as

$$\begin{aligned}
\nabla \langle \Pi^X \rangle &= (q + 3r - 2)\,\hat{p} - (1 - p)\,\hat{q} + 3p\,\hat{r} \\
\nabla \langle \Pi^Y \rangle &= (3 - q - r)\,\hat{p} + (1 - p)\,\hat{q} - p\,\hat{r}.
\end{aligned} \tag{4.7}$$

This perfect information game can then be optimized by inspection, or by equating gradients to zero, or by using backwards induction. The resulting optimal pure strategy choices are $(x, y) = (0, 1)$ giving payoffs of $(\Pi^X, \Pi^Y) = (2, 2)$.
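The backwards induction route can be sketched on the pure strategy tree. The snippet below is a small illustration, not the author's procedure; the names `PAYOFFS` and `backward_induction` are hypothetical, and the payoff values are those generated by Eq. (4.1).

```python
# Payoffs (Pi^X, Pi^Y) for each realized pair (x, y), from Eq. (4.1).
PAYOFFS = {(0, 0): (3, 1), (0, 1): (2, 2), (1, 0): (1, 4), (1, 1): (4, 3)}


def backward_induction(payoffs):
    """Solve the two-stage perfect information game from the last stage."""
    # Stage 2: for each observed x, player Y picks the y maximizing Pi^Y.
    reply = {x: max((0, 1), key=lambda y: payoffs[(x, y)][1]) for x in (0, 1)}
    # Stage 1: player X anticipates those replies and maximizes Pi^X.
    x_star = max((0, 1), key=lambda x: payoffs[(x, reply[x])][0])
    return x_star, reply, payoffs[(x_star, reply[x_star])]


if __name__ == "__main__":
    x_star, reply, outcome = backward_induction(PAYOFFS)
    print(x_star, reply, outcome)
```

Player Y replies $y = 1$ to $x = 0$ and $y = 0$ to $x = 1$, so player X compares a payoff of 2 against 1 and chooses $x = 0$.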

4.1.3 Constrained behavioural space $\mathcal{P}_B|_{\rho_{xy} = \rho}$

We now consider the constrained behavioural spaces $\mathcal{P}_B|_{\rho_{xy} = \rho}$, $\forall \rho \in [-1, 1]$. The two players are non-communicating and it is generally not possible to use a single value for the correlation $\rho$, and this generally makes the analysis intractable. However, player Y has total control over the setting of the correlation $\rho$ in three cases: when $\rho = \pm 1$ and $\rho = 0$. We consider these cases now.

First consider the space $\mathcal{P}_B|_{\rho_{xy} = 1}$ in which the variables are functionally equal so $y = x = xy$. (We can consider the payoff functions directly rather than their expected values.) In this space the players face the respective optimization tasks

$$\begin{aligned}
X &: \max_x \Pi^X(x) = 3 + x \\
Y &: \Pi^Y(x) = 1 + 2x.
\end{aligned} \tag{4.8}$$

As a result, player X optimizes their payoff by setting $x = 1$ giving the outcomes $(\Pi^X, \Pi^Y) = (4, 3)$.

In contrast, in the space $\mathcal{P}_B|_{\rho_{xy} = -1}$, the variables are functionally related by $y = 1 - x$ and $xy = 0$. These constraints render the optimization tasks as

$$\begin{aligned}
X &: \max_x \Pi^X(x) = 2 - x \\
Y &: \Pi^Y(x) = 2 + 2x.
\end{aligned} \tag{4.9}$$

Here, player X chooses $x = 0$ to optimize their payoff leading to the outcomes $(\Pi^X, \Pi^Y) = (2, 2)$.

Finally, when player Y chooses to discard all information about the $x$ variable, then the variables $x$ and $y$ are independent and the chosen space is $\mathcal{P}_B|_{\rho_{xy} = 0}$. When the variables are independent, there might not necessarily be a pure strategy solution and we need to optimize expected payoffs. In this space, we have $\langle x \rangle = p$ and $\langle y \rangle = q$ and $\langle xy \rangle = \langle x \rangle \langle y \rangle = pq$, giving the optimization problem

$$\begin{aligned}
X &: \max \langle \Pi^X \rangle = 3 - 2p - q + 4pq \\
Y &: \max \langle \Pi^Y \rangle = 1 + 3p + q - 2pq.
\end{aligned} \tag{4.10}$$

The best response functions or equivalent partial differentials are

$$\begin{aligned}
X &: \frac{\partial \langle \Pi^X \rangle}{\partial p} = -2 + 4q \\
Y &: \frac{\partial \langle \Pi^Y \rangle}{\partial q} = 1 - 2p
\end{aligned} \tag{4.11}$$

locating the optimal point at $(p, q) = \left(\tfrac{1}{2}, \tfrac{1}{2}\right)$ with expected payoffs of $(\langle \Pi^X \rangle, \langle \Pi^Y \rangle) = \left(\tfrac{5}{2}, \tfrac{5}{2}\right)$.

At this stage of the analysis, both players have separately calculated an equilibrium point in three spaces $\mathcal{P}_B|_{\rho_{xy} = \rho}$ for $\rho \in \{-1, 0, 1\}$, and the selection of these correlation states is solely at the discretion of player Y. The expected payoffs gained at each of these "local" equilibrium points can then be compared to obtain a "global" optimal expected payoff. For convenience, these are summarized here:

$$\begin{array}{c|c}
\rho & (\langle \Pi^X \rangle, \langle \Pi^Y \rangle) \\ \hline
-1 & (2, 2) \\
0 & \left(\tfrac{5}{2}, \tfrac{5}{2}\right) \\
+1 & (4, 3).
\end{array} \tag{4.12}$$

Based on these results, player Y will then rationally optimize their expected payoff by choosing to have their variables in a state of perfect correlation with $\rho = 1$ in the space $\mathcal{P}_B|_{\rho_{xy} = 1}$. Player X, also being a rational optimizer, will play accordingly to give equilibrium payoffs of $(\langle \Pi^X \rangle, \langle \Pi^Y \rangle) = (4, 3)$.
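The three-way comparison can be sketched in a few lines of Python. This is an illustrative reading of the procedure above, with hypothetical function names; it assumes that in the $\rho = \pm 1$ spaces player X alone fixes the outcome, that the $\rho = 0$ space is evaluated at the stationary point $(p, q) = (1/2, 1/2)$ of Eq. (4.11), and that player Y then picks the correlation state giving the largest $\langle \Pi^Y \rangle$.

```python
from fractions import Fraction


def pi_x(x, y, xy):
    """Pi^X of Eq. (4.1), linear in <x>, <y> and <xy>."""
    return 3 - 2 * x - y + 4 * xy


def pi_y(x, y, xy):
    """Pi^Y of Eq. (4.1), linear in <x>, <y> and <xy>."""
    return 1 + 3 * x + y - 2 * xy


def compare_correlation_states():
    outcomes = {}
    # rho = +1: y = x and xy = x, so player X chooses x directly.
    x = max((0, 1), key=lambda v: pi_x(v, v, v))
    outcomes[1] = (pi_x(x, x, x), pi_y(x, x, x))
    # rho = -1: y = 1 - x and xy = 0.
    x = max((0, 1), key=lambda v: pi_x(v, 1 - v, 0))
    outcomes[-1] = (pi_x(x, 1 - x, 0), pi_y(x, 1 - x, 0))
    # rho = 0: stationary point of Eq. (4.11), -2 + 4q = 0 and 1 - 2p = 0.
    p = q = Fraction(1, 2)
    outcomes[0] = (pi_x(p, q, p * q), pi_y(p, q, p * q))
    # Player Y selects the correlation state maximizing their own payoff.
    best_rho = max(outcomes, key=lambda rho: outcomes[rho][1])
    return outcomes, best_rho


if __name__ == "__main__":
    print(compare_correlation_states())
```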

It is useful again to reemphasize a geometric picture. As shown in Fig. 4.2(a), an unconstrained behavioural space has a three-dimensional gradient everywhere which is non-zero even when $x$ and $y$ are perfectly correlated, so payoffs are not optimized at any such points. In contrast, the use of isomorphic constraints when the $x$ and $y$ variables are perfectly correlated gives the situation in Fig. 4.2(b), where now a 1-dimensional gradient points solely along the $p$ axis. A comparison in Fig. 4.2(c) of the resulting outcomes can then be made to determine which probability space should be chosen so as to maximize outcomes.




Figure 4.2: (a) Game theory adopts an unconstrained joint probability measure space in which expected payoffs vary over three dimensions $(p, q, r)$ and where positive gradients with respect to $q$ and $r$ (dotted arrows) and with respect to $p$ (solid arrow) ensure that players maximize joint payoffs by choosing $(p, q, r) = (0, 1, 0)$. (b) An alternate joint probability space where $x$ is perfectly correlated to $y$, in which expected payoffs vary over a single dimension $p$ with positive gradients with respect to $p$ (solid arrow) ensuring that players optimize payoffs by choosing $p = 1$. (c) The choice of two alternate probability spaces (more are possible) associates two different total gradients (double-lined arrows) with any point along the perfect correlation line $\rho_{xy} = 1$ at $(q, r) = (0, 1)$. In the absence of any effective decision procedure privileging any one space over another, players should examine all possible spaces, all possible gradients, and all possible optimized outcomes.



4.1.4 Strategic analysis difficulties

The players might then seek to supplement the above solutions by considering a wider range of correlation states. The optimization task then becomes

$$\begin{aligned}
X &: \max_{p} \langle \Pi^X \rangle = 3 - 2p - q + pq + 3pr \\
Y &: \max_{q, r} \langle \Pi^Y \rangle = 1 + 3p + q - pq - pr \\
&\text{subject to } \rho_{xy} = \rho, \ \forall \rho \in [-1, 1].
\end{aligned} \tag{4.13}$$



Unfortunately, there does not seem to be any straightforward way to make progress with the general correlation case. Players are non-communicating and hence cannot agree on a value of the correlation state $\rho$. If players adopt different values of the correlation states they must model conflicting global constraints and it is not clear how these can be resolved. An attempt to model the use of a single correlation state generates expected payoff functions which are not poly-linear in the probability parameters and that are not generally quasi-concave. This implies that existence theorems for Nash equilibria are inapplicable in these cases, so equilibrium points might not exist for different correlation states. It is more than likely that an acceptable solution methodology does not exist for strategic interactions in the general correlation case, and it is beyond the scope of this paper to consider this issue further. Here finally, we find the irreducible complexity of strategic analysis expected by von Neumann and Morgenstern.

4.1.5 More general constrained analysis 

The choice of variable $y$ is normally modeled as requiring two separate and independent coin tosses (see the behavioural space tree of Fig. 4.1). When $x = 0$ a coin is tossed determining $y = u \in \{0, 1\}$ with respective probabilities $(1 - q, q)$, while when $x = 1$ another coin is tossed determining $y = v \in \{0, 1\}$ with respective probabilities $(1 - r, r)$. The $u$ and $v$ coins are then simple, biassed, independent coins.

However, there is no need for this simplest possible treatment. The $u$ and $v$ coins could themselves be modeled using any of the alternate probability spaces of Eqs. 1.21 to 1.34. These alternate probability spaces would need to be checked by rational players of unbounded capacity.

Another possible probability space might consider the $u$ and $v$ variables themselves to be partially correlated. That is, the second stage player chooses to partially correlate their two behavioural strategies by employing two sequential roulettes. The first determines the variable $u \in \{0, 1\}$ with probabilities $(1 - q, q)$ while the second gives $v \in \{0, 1\}$ with respective probabilities $(1 - r_1, r_1)$ if $u = 0$ and $(1 - r_2, r_2)$ if $u = 1$. The resulting correlation between the variables $u$ and $v$ is then

$$\rho_{uv}(q, r_1, r_2) = \frac{\sqrt{q(1 - q)}\,(r_2 - r_1)}{\sqrt{\left[r_1 + q(r_2 - r_1)\right]\left[1 - r_1 - q(r_2 - r_1)\right]}}. \tag{4.14}$$



When $r_1 = r_2$ then these variables are uncorrelated as usual. In turn, this correlation between the $u$ and $v$ variables renders the correlation between the $x$ and $y$ variables as

$$\rho_{xy}(p, q, r_1, r_2) = \frac{\sqrt{p(1 - p)}\,\left[r_1 - q(1 + r_1 - r_2)\right]}{\sqrt{\left[q + p(r_1 - q) + pq(r_2 - r_1)\right]\left[1 - q - p(r_1 - q) - pq(r_2 - r_1)\right]}}. \tag{4.15}$$
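Both correlation formulas can be checked by brute-force enumeration of the joint distribution. The Python sketch below is illustrative and rests on assumptions worth stating: $x$ is drawn independently of $(u, v)$ with $P(x=1) = p$, $u$ is drawn with $P(u=1) = q$, $v$ is drawn with $P(v=1) = r_1$ or $r_2$ according to $u$, and $y = u$ when $x = 0$ while $y = v$ when $x = 1$. The function names are hypothetical.

```python
import math
from itertools import product


def enumerated_correlations(p, q, r1, r2):
    """Brute-force (rho_uv, rho_xy) from the joint distribution of (x, u, v)."""
    m = {"x": 0.0, "y": 0.0, "u": 0.0, "v": 0.0, "xy": 0.0, "uv": 0.0}
    for x, u, v in product((0, 1), repeat=3):
        r = r2 if u else r1
        weight = (p if x else 1 - p) * (q if u else 1 - q) * (r if v else 1 - r)
        y = v if x else u  # y = u when x = 0, y = v when x = 1
        for name, value in (("x", x), ("y", y), ("u", u), ("v", v),
                            ("xy", x * y), ("uv", u * v)):
            m[name] += weight * value

    def corr(a, b, ab):
        # A binary variable with mean mu has variance mu * (1 - mu).
        return (m[ab] - m[a] * m[b]) / math.sqrt(
            m[a] * (1 - m[a]) * m[b] * (1 - m[b]))

    return corr("u", "v", "uv"), corr("x", "y", "xy")


def closed_form_correlations(p, q, r1, r2):
    """Closed forms corresponding to Eqs. (4.14) and (4.15)."""
    mean_v = r1 + q * (r2 - r1)
    mean_y = q + p * (r1 - q) + p * q * (r2 - r1)
    rho_uv = (math.sqrt(q * (1 - q)) * (r2 - r1)
              / math.sqrt(mean_v * (1 - mean_v)))
    rho_xy = (math.sqrt(p * (1 - p)) * (r1 - q * (1 + r1 - r2))
              / math.sqrt(mean_y * (1 - mean_y)))
    return rho_uv, rho_xy


if __name__ == "__main__":
    args = (0.3, 0.6, 0.2, 0.9)
    print(enumerated_correlations(*args), closed_form_correlations(*args))
```

Setting `r1 == r2` in either function returns a vanishing $\rho_{uv}$, matching the uncorrelated case.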



The second stage player might then choose to adopt a probability space with a constant correlation between the $u$ and $v$ variables, $\rho_{uv}(q, r_1, r_2) = \rho_{uv}$ say. If $\rho_{uv} = 0$ then we have the usual situation of uncorrelated behavioural strategies normally considered by game theory. Conversely, if $\rho_{uv} = \pm 1$ we have respectively either perfectly correlated or perfectly anti-correlated behavioural strategies. If such a correlation constraint can be adopted, then both players should analyze this possibility to determine whether it is optimal.

Even more strangely, the $u$ and $v$ coin tosses could themselves be partially correlated to the previous choice of $x$. That is, the $u$ and $v$ variables can be correlated with $x$, and only after they have been chosen is the value for $y$ assigned. For example, we might have $u$ perfectly anti-correlated with $x$ so $u = 1 - x$ and $v$ perfectly correlated with $x$ so $v = x$, and then we assign $y = u$ if $x = 0$ and $y = v$ if $x = 1$. There are many possible choices that might be considered. In particular, we might consider the 9 possible cases which arise when firstly the $u$ variable is either perfectly anti-correlated to $x$ (denoted $\mathcal{P}^Y_{-}$), independent of $x$ ($\mathcal{P}^Y_{0}$) or perfectly correlated to $x$ ($\mathcal{P}^Y_{+}$), and the $v$ variable is either perfectly anti-correlated to $x$ (denoted $\mathcal{P}^Y_{-}$), independent of $x$ ($\mathcal{P}^Y_{0}$) or perfectly correlated to $x$ ($\mathcal{P}^Y_{+}$). We have introduced subscript symbols indicating these possibilities. That is, we separately have

$$P^Y(u) = \begin{cases} \delta_{u(1-x)} & \mathcal{P}^Y_{-} \\ (1 - q, q) & \mathcal{P}^Y_{0} \\ \delta_{ux} & \mathcal{P}^Y_{+} \end{cases}
\qquad
P^Y(v) = \begin{cases} \delta_{v(1-x)} & \mathcal{P}^Y_{-} \\ (1 - r, r) & \mathcal{P}^Y_{0} \\ \delta_{vx} & \mathcal{P}^Y_{+}. \end{cases} \tag{4.16}$$



The right hand column here lists the shorthand notation for each adopted strategy. This notation shows that if $u$ is independent of $x$ while $v$ is perfectly correlated to $x$, the second stage probability distribution adopted by player Y is $\mathcal{P}^Y_{0+}$. Similarly, when both $u$ and $v$ are perfectly correlated to $x$ we have the probability distribution $\mathcal{P}^Y_{++}$. Each of these choices of a different probability space generates a different optimum within that space, and these optima must be compared so that players can decide which space they can rationally choose. Without showing the details, the generated outcomes in these possible spaces are

$$\begin{array}{c|c}
 & (\langle \Pi^X \rangle, \langle \Pi^Y \rangle) \\ \hline
\mathcal{P}^Y_{--} & (2, 2) \\
\mathcal{P}^Y_{-0} & (2, 2) \\
\mathcal{P}^Y_{-+} & (4, 3) \\
\mathcal{P}^Y_{0-} & (2, 2) \\
\mathcal{P}^Y_{00} & (2, 2) \\
\mathcal{P}^Y_{0+} & (4, 3) \\
\mathcal{P}^Y_{+-} & (3, 1) \\
\mathcal{P}^Y_{+0} & (3, 1) \\
\mathcal{P}^Y_{++} & (4, 3).
\end{array} \tag{4.17}$$



These outcomes can easily be verified by drawing the different trees generated by each choice of joint probability space as shown in Fig. 4.3. This extended table of distinct trees makes evident that again, within this range of considered joint probability spaces, player Y optimizes their outcomes by choosing, for instance, the space $\mathcal{P}^Y_{++}$, ensuring that their choice is perfectly correlated with that of their opponent.
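The nine outcomes can be reproduced by a short enumeration. The sketch below is one possible reading of the construction, with hypothetical names: a "-" symbol forces the reply $1 - x$, a "+" symbol forces the reply $x$, and a "0" symbol leaves player Y free to choose the reply maximizing $\Pi^Y$; player X then picks the $x$ with the larger resulting $\Pi^X$.

```python
# Payoffs (Pi^X, Pi^Y) for each realized pair (x, y), from Eq. (4.1).
PAYOFFS = {(0, 0): (3, 1), (0, 1): (2, 2), (1, 0): (1, 4), (1, 1): (4, 3)}


def second_stage_reply(symbol, x):
    """Player Y's reply to x under one correlation symbol."""
    if symbol == "-":
        return 1 - x  # perfectly anti-correlated with x
    if symbol == "+":
        return x      # perfectly correlated with x
    # "0": free choice, so Y maximizes their own payoff at this node.
    return max((0, 1), key=lambda y: PAYOFFS[(x, y)][1])


def outcome(u_symbol, v_symbol):
    """Equilibrium payoffs when u governs x = 0 and v governs x = 1."""
    reply = {0: second_stage_reply(u_symbol, 0),
             1: second_stage_reply(v_symbol, 1)}
    x_star = max((0, 1), key=lambda x: PAYOFFS[(x, reply[x])][0])
    return PAYOFFS[(x_star, reply[x_star])]


# Keys such as "0+" mirror the two-symbol subscripts used for the spaces.
TABLE = {u + v: outcome(u, v) for u in "-0+" for v in "-0+"}

if __name__ == "__main__":
    for label, payoffs in TABLE.items():
        print(label, payoffs)
```

Every space whose second symbol is "+" yields $(4, 3)$, which is why $\mathcal{P}^Y_{++}$ is only one of several choices available to player Y.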



We argue that optimizing multiple-player-multiple-stage games is more complicated than envisaged in conventional game analysis. As noted earlier, the strategic optimization of expected payoffs first requires the adoption of a suitable joint probability measure space, and it is only the adoption of such a space that permits the functional definition of both the expected payoff and suitable gradient operators allowing the optimization to be completed. For the above simple two player game, the expected payoffs and gradient operators have been respectively defined variously as

$$\begin{array}{c|c}
(\langle \Pi^X \rangle, \langle \Pi^Y \rangle) & \\ \hline
(2 - p,\; 2 + 2p) & \mathcal{P}^Y_{--} \\
(2 - p + 3pr,\; 2 + 2p - pr) & \mathcal{P}^Y_{-0} \\
(2 + 2p,\; 2 + p) & \mathcal{P}^Y_{-+} \\
(3 - 2p - q + pq,\; 1 + 3p + q - pq) & \mathcal{P}^Y_{0-} \\
(3 - 2p - q + pq + 3pr,\; 1 + 3p + q - pq - pr) & \mathcal{P}^Y_{00} \\
(3 + p - q + pq,\; 1 + 2p + q - pq) & \mathcal{P}^Y_{0+} \\
(3 - 2p,\; 1 + 3p) & \mathcal{P}^Y_{+-} \\
(3 - 2p + 3pr,\; 1 + 3p - pr) & \mathcal{P}^Y_{+0} \\
(3 + p,\; 1 + 2p) & \mathcal{P}^Y_{++}
\end{array} \tag{4.18}$$
Figure 4.3: The nine distinct trees, payoffs and equilibria (indicated by triangles) given that players X and Y adopt the indicated joint probability space. The two subscript symbols here respectively indicate whether each of player Y's second stage choices are perfectly anti-correlated ("−"), uncorrelated ("0"), or perfectly correlated ("+") to the previously observed random variable x.

(4.19)



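That is, the expected payoff is defined here as a joint functional mapping from the various probability measure spaces to the reals via

$$(\langle \Pi^X \rangle, \langle \Pi^Y \rangle) : \left\{\mathcal{P}^X \times \mathcal{P}^Y_{--},\; \mathcal{P}^X \times \mathcal{P}^Y_{-0},\; \ldots,\; \mathcal{P}^X \times \mathcal{P}^Y_{++}\right\} \to \mathbb{R} \times \mathbb{R}. \tag{4.20}$$

Again, this is in sharp contrast to the usual definition of game theory that it is sufficient for optimization to consider that the expected payoff is defined as the joint function mapping

$$(\langle \Pi^X \rangle, \langle \Pi^Y \rangle) : \mathcal{P}^X \times \mathcal{P}^Y_{00} \to \mathbb{R} \times \mathbb{R}. \tag{4.21}$$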



4.2 Backwards induction and isomorphism constraints

We have mentioned above that backwards induction can be used to solve the unconstrained optimization problem. This approach is often presented as a 'proof' that no alternative procedure could possibly be considered by a rational player. It is worth taking a closer look at what is involved in the backwards induction algorithm, and how it interacts with isomorphic constraints.

Backwards induction first constrains the values of first stage probability parameters and then evaluates the gradients of the expected payoff function $\langle \Pi^Y \rangle$ at different nodes in the last stage of the game. These last stage gradients are then used to set the optimal values of the $(q, r)$ probability variables. These values are then applied as constraints to the evaluation of the gradient of the expected payoff function $\langle \Pi^X \rangle$ in the first stage of the game, where the first stage probability parameters are now treated as variables. To illustrate these steps, we choose to begin our analysis at a point in the behavioural strategy space where the variables are perfectly correlated at $(q, r) = (0, 1)$. The steps involved are:

$$\begin{aligned}
\lim_{(q,r) \to (0,1)} \frac{\partial \langle \Pi^Y \rangle|_{p=0}}{\partial q} &= 1 > 0, \text{ so } q \to 1 \\
\lim_{(q,r) \to (0,1)} \frac{\partial \langle \Pi^Y \rangle|_{p=1}}{\partial r} &= -1 < 0, \text{ so } r \to 0 \\
\frac{\partial \langle \Pi^X \rangle|_{(q,r)=(1,0)}}{\partial p} &= -1 < 0, \text{ so } p \to 0.
\end{aligned} \tag{4.22}$$

The optimal point is then at $(p, q, r) = (0, 1, 0)$ giving payoffs of $(\langle \Pi^X \rangle, \langle \Pi^Y \rangle) = (2, 2)$.
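The same three steps can be traced with the gradient components of Eq. (4.7). The following Python sketch is only an illustration; the helper names are hypothetical, and sending a parameter to 0 when its gradient is exactly zero is an arbitrary tie-break assumed here, not something taken from the text.

```python
def grad_y(p):
    """(d/dq, d/dr) of <Pi^Y> = 1 + 3p + q - pq - pr, from Eq. (4.7)."""
    return 1 - p, -p


def grad_x_p(q, r):
    """d/dp of <Pi^X> = 3 - 2p - q + pq + 3pr, from Eq. (4.7)."""
    return q + 3 * r - 2


def push(gradient):
    """Send a probability parameter to 1 for a positive gradient, else to 0."""
    return 1 if gradient > 0 else 0


def backward_induction_by_gradients():
    q = push(grad_y(0)[0])    # node reached when x = 0, i.e. p = 0
    r = push(grad_y(1)[1])    # node reached when x = 1, i.e. p = 1
    p = push(grad_x_p(q, r))  # first stage, with (q, r) now held fixed
    payoff_x = 3 - 2 * p - q + p * q + 3 * p * r
    payoff_y = 1 + 3 * p + q - p * q - p * r
    return (p, q, r), (payoff_x, payoff_y)


if __name__ == "__main__":
    print(backward_induction_by_gradients())
```

Note that the first stage gradient is evaluated only after $(q, r)$ have been fixed, which is the ordering that the constrained spaces discussed next call into question.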

It is straightforward to apply the backwards induction algorithm to an isomorphically constrained space, provided that the global isomorphic constraints and the altered geometry are taken into account. If the variables $x$ and $y$ are perfectly correlated then the game tree reduces to a single stage and backwards induction is properly applied to that single stage. However, problems arise when, as is common, it is argued that backwards induction must be applied to both stages even when the $x$ and $y$ variables are perfectly correlated. This argument presupposes that backwards induction overrides isomorphic constraints and the altered game geometry.

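To see how this is done, let us imagine trying to apply the backwards induction algorithm to an isomorphically constrained perfectly correlated space $\mathcal{P}_B|_{(q,r)=(0,1)}$ with $\rho_{xy} = 1$. The above evaluations then try to combine limit processes, gradient evaluations, and isomorphic constraints with global scope. That is:

$$\begin{aligned}
\lim_{(q,r) \to (0,1)} \left.\frac{\partial \langle \Pi^Y \rangle|_{p=0}}{\partial q}\right|_{(q,r)=(0,1)} &= \;? \\
\lim_{(q,r) \to (0,1)} \left.\frac{\partial \langle \Pi^Y \rangle|_{p=1}}{\partial r}\right|_{(q,r)=(0,1)} &= \;? \\
\left.\frac{\partial \langle \Pi^X \rangle|_{(q,r)=(1,0)}}{\partial p}\right|_{(q,r)=(0,1)} &= \;?
\end{aligned} \tag{4.23}$$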



Mathematically and logically, these statements make little sense. An isomorphic constraint of global scope sets the values $(q, r) = (0, 1)$ and then backwards induction seeks to treat these parameters as variables and evaluate a gradient with respect to these variables. In actuality, these variables no longer exist in this constrained probability space as there is no second stage in this probability space. The altered probability space geometry has altered the game tree to include only one stage and one probability parameter.
Let us try a slightly more general treatment. Consider briefly the optimization by player $Z \in \{X, Y\}$ of an example two stage game where $x$ is known before $y$ is decided, giving

$$\langle \Pi^Z \rangle = \sum_{x,y=0}^{1} P^X(x)\, P^Y(y|x)\, \Pi^Z(x, y). \tag{4.24}$$

The conventional analysis begins by drawing a single game tree capturing every possible move that might be made along every history, and assigning independent distributions to each decision point which can then be optimized via backwards induction. Then, backwards induction begins by optimizing the last stage first via, for instance, evaluations like

$$\frac{\partial}{\partial P^Y(y'|x')} P^X(x') \left\{ \left[1 - P^Y(y'|x')\right] \Pi^Y(x', 1 - y') + P^Y(y'|x')\, \Pi^Y(x', y') \right\} = P^X(x') \left( \Pi^Y(x', y') - \Pi^Y(x', 1 - y') \right). \tag{4.25}$$

Implicit in this evaluation is the assumption that the gradient operator $\frac{\partial}{\partial P^Y(y'|x')}$ commutes with the distribution $P^X(x')$ via

$$\frac{\partial}{\partial P^Y(y'|x')} P^X(x') = P^X(x') \frac{\partial}{\partial P^Y(y'|x')}. \tag{4.26}$$

This is only true under the assumption that the distributions $P^Y(y|x)$ and $P^X(x)$ are not functionally dependent. When this is not the case, then obviously, the above commutation relation cannot be used. Speaking figuratively, for longer $N$ stage games, backwards induction relies on similar independence assumptions allowing gradients with respect to $i^{\text{th}}$ stage distributions $P_i$ to commute with all earlier stage distributions, giving (loosely)

$$\max_{P_1, P_2, \ldots, P_{N-1}, P_N} \langle \Pi \rangle = \max_{P_1} \sum \ldots \max_{P_2} \sum \ldots \max_{P_{N-1}} \sum \ldots \max_{P_N} \sum \ldots \tag{4.27}$$

Again, commuting latter stage gradient operators through all preceding earlier stage distributions is only valid under the assumption that these distributions are not functionally dependent. These assumptions are not necessarily true, and we suggest that rational players will consider the case where they are not warranted.
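To make the role of the independence assumption explicit, a product-rule sketch may help. The symbol $F$ below is shorthand introduced here for the bracketed second stage term of Eq. (4.25), and the possible dependence of $P^X(x')$ on $P^Y(y'|x')$ is a hypothetical one assumed purely for illustration.

```latex
\frac{\partial}{\partial P^Y(y'|x')} \Big[ P^X(x')\, F \Big]
  = P^X(x')\, \frac{\partial F}{\partial P^Y(y'|x')}
  + F\, \frac{\partial P^X(x')}{\partial P^Y(y'|x')},
\qquad
F = \big[ 1 - P^Y(y'|x') \big]\, \Pi^Y(x', 1 - y')
  + P^Y(y'|x')\, \Pi^Y(x', y').
```

The second term vanishes exactly when $P^X(x')$ does not depend on $P^Y(y'|x')$, which recovers Eq. (4.25). Under a functional dependence between the two distributions it generally survives, so the last stage can no longer be optimized in isolation.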

In our approach in contrast, we hold that the functionals $\langle \Pi^Z \rangle$ cannot be represented by a single game tree of finite size, and that they possess neither dimensionality nor continuity properties. While they are a mapping into a range of reals, their domain sets are essentially unspecified. In fact, and crudely put, if $S$ is the set of all possible feasible spaces for this game, say $S = \{\mathbb{R}^1, \mathbb{R}^2, \ldots\}$, then the functional is a mapping from the set of all possible feasible spaces to the reals, $\langle \Pi^Z \rangle : S \to \mathbb{R}$. Just as a topological space possesses dimensionality but lacks any measure of distance and only gains such measures with the adoption of a metric, these expected payoff functionals do not even possess dimensionality prior to the adoption of a suitable probability measure space. In fact, the mapping $\langle \Pi^Z \rangle$ must be defined over every possible probability measure space. For all these possible spaces, within any adopted probability measure space, $\langle \Pi^Z \rangle$ becomes a function of fixed dimensionality and specified continuity and differentiability properties which can be described by a suitable decision tree. Such a tree then supports the backwards induction and subgame decomposition operations which can then be used to optimize pathways through this particular tree, one instance among many of the trees definable using the entire mapping $\langle \Pi^Z \rangle$.

The adoption of a probability measure space inducing correlations between any game 
variables alters the structure of the decision tree to create an irreducible whole entity 
which must be optimized as a single unit. Backwards induction and subgame decompo- 
sitions cannot be improperly used to break these indivisible units as any such attempt 
is simply mathematically invalid. This has profound implications canvassed later for the 
evolution of hierarchical complexity. 

When player Y chooses an alternate probability space such as $\mathcal{P}^Y_{++}$ in which all of the second stage choices are perfectly correlated with their opponent's previous move, then they possess no free parameters and so have nothing to vary to optimize their payoff. This restriction of their ability to vary their second stage choice has been implicitly considered to be a reason for not using the correlated probability space $\mathcal{P}^Y_{++}$ in favour of the conventional space $\mathcal{P}^Y_{00}$. This latter probability space allows players to consider all possibilities in the second stage, thus justifying the use of this probability space. However, this is a misleading argument. No reasons have ever been provided for why a player should restrict their analysis to a single space. Lifting this restriction requires them in turn to choose which space offers them the greatest range of choice. Rather, the player can perform their optimization by first choosing among the infinite number of available probability spaces, and then optimizing over every parameter defined within each space. In some spaces they consider, they will possess a certain number of parameters to vary, and in other spaces they will possess a different number of parameters to vary. Certainly, some spaces will offer no free parameters to vary, but nothing is lost by having a player consider this as one option among many. It is the conventional analysis which restricts player searches by forcing them to consider only a single type of probability space.

It has also been argued that, even when player Y intends to adopt correlated second stage play, their observation that player X chooses $x = 0$ in the first stage will require player Y to rethink their desire to adopt a correlated strategy, so they should then seek to optimize their outcomes given that the choice $x = 0$ has been made. In effect, this argument presupposes that player Y has adopted the conventional probability space which allows this player to have a further choice in the second stage. As emphasized above, one of the firmest results of probability measure theory is that joint probability distributions are separable if and only if all the variables are independent. That is, different variables can be separately optimized if and only if they are described by separable joint probability distributions, and this occurs if and only if they are independent. This means that it is only when variables are independent that a subgame decomposition can be performed, allowing players to separately optimize decisions in each subgame. It is nonsense to argue that non-independent and non-separable variables are actually separable and hence separately optimized. When player Y has made a prior choice to adopt the probability space $\mathcal{P}^Y_{++}$, then they have freely chosen not to have a choice in the second stage, and they will compare the payoffs stemming from this choice with those available from alternate choices.

To reiterate previous points, a coin consists of many components possessing correlated 
dynamics, and these correlations permit the construction of a coin decision tree with only 
two branches indicating Heads or Tails. A pseudo-random number generator consists of 
millions of components all possessing correlated dynamics so again, the total decision 
tree might possess only two branches. Correlation between variables reduces the size of 
decision trees, and alters the dimensionality of expected payoff functional spaces. 

4.3 Optimizing over multiple joint probability spaces 

We now have multiple possible joint probability spaces. In these alternate spaces, the expected payoff functions possess exactly the same value when $x$ and $y$ are perfectly correlated but possess entirely different gradients at this point. Variational optimization principles insist that every possible functional form and gradient must be taken into account in any complete optimization. These principles permit players to infinitely vary the "immutable" functional assignments defining any space (i.e. $y = \delta_{x0}\, u + \delta_{x1}\, v$ and $y = x$ above), providing access to a vastly larger decision space than usually analyzed in game theory. It is not a question of which space is best; rather, it is a question of either restricting the analysis to a single space or allowing players to analyze all possible spaces.

Game theory adopts expected payoff "functions" allowing examination of every pos- 
sible combination of payoff values and assumes that this is sufficient for optimization. 
However, while these functions can duplicate every possible payoff value, they cannot 
duplicate every possible functional dependency or gradient — and optimization depends 
on these dependencies and gradients. 

More generally, in our approach, rational players are able to perform an entirely unconstrained search of every possible joint probability space to optimize their payoffs via

$$\begin{aligned}
X &: \max \langle \Pi^X \rangle = \int dP^{XY}\, \Pi^X(x, y) \\
Y &: \max \langle \Pi^Y \rangle = \int dP^{XY}\, \Pi^Y(x, y).
\end{aligned} \tag{4.28}$$



Here, each player's optimization is over every possible probability space that might be applied to their problem. Game analysis then requires players to jointly define a product probability space $\mathcal{P}^X \times \mathcal{P}^Y$ where player X is responsible for $\mathcal{P}^X$ and player Y is responsible for $\mathcal{P}^Y$. As noted above, each player $Z$ can use any of an infinite number of alternate probability spaces which we here enumerate $\mathcal{P}^Z_i$ for $i = 0, 1, 2, \ldots$. (The number of probability spaces is non-denumerable.) Because each player must optimize their choices given the choices made by their opponent, both players must analyze every possible joint probability space $\mathcal{P}^X_i \times \mathcal{P}^Y_j$ for $i, j = 0, 1, 2, \ldots$. Each player is then faced with the task of sequentially analyzing what happens given the adoption of every possible joint probability space, then optimizing their own payoffs within each adopted probability space, and then comparing the payoffs attainable from each joint probability space to determine which space both they and their opponents will adopt.

In contrast, conventional analysis mandates that players must necessarily adopt a single probability space (whether mixed or behavioural) leading to what is effectively a heavily constrained optimization

$$\begin{aligned}
X &: \max \langle \Pi^X \rangle = \int dP^{XY}\, \Pi^X(x, y) \\
Y &: \max \langle \Pi^Y \rangle = \int dP^{XY}\, \Pi^Y(x, y) \\
&\text{subject to } \mathcal{P}^X = \mathcal{P}^X_0, \ \mathcal{P}^Y = \mathcal{P}^Y_0.
\end{aligned} \tag{4.29}$$

That is, of all the possible joint probability spaces that might be adopted, game theory restricts its rational players to a single mandated choice, and it does so without ever proving that this single choice is somehow optimal.

We argue that optimization theory and probability theory are entirely consistent with the fact that a known correlation state between random variables will influence the dimensionality and gradients of an optimization problem. In view of this, these fields offer no reasons whatsoever for the necessity of the constraint shown in the last line above.

4.3.1 Rational game play: A story 

Let us make the mathematics more concrete by telling a story in an attempt to assist 
conceptualization of the new methods presented here. 

Suppose that you are the first player, player X, in the example two stage game. As
shown in Fig. 4.4, you are in a room with your opponent, player Y, and together, you
are looking at the game playing equipment. As player X, you play first and have to drop
a large ball down one of two channels marked x = 0 or x = 1. To assist your decision,
you have an urn containing a prepared number of white or black marbles allowing you
to implement a randomized mixed strategy by selecting x = 0 with probability 1 - p or
x = 1 with probability p. You have chosen p so as to maximize your payoff. You are also
aware that after your ball has landed in the appropriate cup, your opponent, player Y,
will choose one of their two randomizing urns which each contain appropriate numbers of
white and black marbles. The first urn allows player Y to choose y = 0 with probability
1 - q and y = 1 with probability q, while the second urn allows them to choose y = 0
with probability 1 - r and y = 1 with probability r. Player Y has chosen q and r so
as to maximize their payoff. After determining their choice of y = 0 or y = 1, player Y
will drop the ball down the appropriate channel so that it lands in the waiting cup to
provide a permanent record of each player's decisions. The players then divide a payoff
accordingly as shown in Fig. 4.4. As shown in previous sections, a conventional analysis
results in the play combination (x, y) = (0, 1) and respective payoffs of
$(\langle \Pi^X \rangle, \langle \Pi^Y \rangle) = (2, 2)$. The above situation captures
the conventionally mandated procedure for payoff maximization in this particular
strategic interaction. It is presumed that the specified use of the respective urns by
each player along with the conventional analysis specifying the values of p, q and r
suffices to optimize player payoffs. What could be simpler?

Figure 4.4: The conventional play of the two stage game features a closed room containing
players X and Y, their respective randomizing urns used to implement mixed strategies,
and a large metallic apparatus featuring a ball, and different channels and cups to act
as a decision recording device. Player X implements their "p" randomization by drawing
either a white or black marble from their urn, and correspondingly drops the ball down
the x = 0 or x = 1 channel. Player Y picks up the ball, selects the relevant urn
implementing either their "q" or "r" randomizations, draws either a white or black marble,
and correspondingly drops the ball down the appropriate y = 0 or y = 1 channel into the
waiting cups. Payoffs are assigned as shown.

Notice however that game theory has never provided a proof that the above proce- 
dure is complete, necessary, or sufficient. In particular, von Neumann and Morgenstern 
explicitly used a method of "indirect proof" subject to later falsification and so did not
prove the completeness, the necessity, or the sufficiency of their methods. Nash simply 
adopted a mixed strategy probability space as the simplest way to provide an existence 
proof for what are now called Nash equilibria. Kuhn established only that mixed and 
behavioral probability spaces were equivalent in games of perfect recall, but did not es- 
tablish that they were complete, necessary, or sufficient. In fact, no-one has ever provided 
a mathematical proof of the completeness, the necessity, or the sufficiency of preferring 
one probability space over all others. Absent such proof, we suggest that rational players 
will explore every feasible probability space describing any given game. In the absence 
of any confirmed decision procedure mandating the use of one probability space over 
all others, we suppose that players have the capacity to examine alternate probability 
spaces, and choose between them so as to maximize their payoffs. 

Accordingly, suppose now that player Y adopts a different procedure to that conven- 
tionally mandated. Suppose in fact that player Y walks into the game room equipped 
with a toolkit containing hacksaws, hammers, and welding equipment, and suppose that 
before the game commences they set to work to reconfigure the decision recording de- 
vice. As player X, you gaze in appalled fascination as Y hammers, cuts, and welds away 
until the result is as shown in Fig. 4.5. As the time to start the game approaches, you
have a decision to make. Your eyes provide you with evidence that the decision making
device has been altered. Your previous analysis was based on the conventionally mandated
device structure, but its alteration makes the previous analysis irrelevant and in
all likelihood, wrong. As player X, you might seek to remonstrate with your opponent
by saying that they cannot alter the definition of the game and that it is mandatory
that they use the conventionally mandated space. In response, player Y simply responds
that they have not altered the game structure in any way, but have merely adopted a
probability space which correlates their decision to the previous choice by X. Every
single move of the game is still present but some have zero probability assigned. This is
always possible. Conventional analysis allows such assignments of zero probability but
then insists that these assignments can be altered by gradient optimization operations.
In contrast, Y asserts that they have assigned zero probabilities to certain moves which
cannot be altered by gradient optimization operations, as is specifically allowed by
probability measure theory. Further, Y knows of no proof of the conventional mandate,
and as they are solely motivated by a desire to maximize their payoff, they will take any
steps appropriate to that goal. Your decision is whether to close your eyes to the altered
nature of the decision making device and continue to argue that any such alteration is
irrational and non-payoff maximizing, or to take the evidence of your eyes into account
and to alter your analysis. What decision will you make? Self-evidently, as player X,
after the game has commenced, you will now choose to drop your ball down the x = 1
channel as that maximizes your payoff. Any other choice will minimize your payoff, and
as a payoff maximizing rational player, you will not make such a choice. The result, as
shown in previous sections, is that a correlated analysis results in the play combination
(x, y) = (1, 1) and respective payoffs of
$(\langle \Pi^X \rangle, \langle \Pi^Y \rangle) = (4, 3)$. This provides an increased
payoff for player Y justifying their rebuilding of the decision recording device.

Figure 4.5: The correlated play of the two stage game features players X and Y and an
altered decision recording apparatus. Player X implements their "p" randomization as
usual and drops the ball down either the x = 0 or x = 1 channel. Player Y has used their
toolkit to alter the device so they no longer have any decision to make as the ball simply
continues falling down an extended channel to the waiting cups. Payoffs are assigned as
shown.




Figure 4.6: The play of the two stage game when player X is unsure how player Y has
reconstructed the decision recording apparatus. Player X implements their "p"
randomization as usual and drops the ball down either the x = 0 or x = 1 channel. Player Y
might be using the conventional apparatus of Fig. 4.4 or the correlated apparatus of
Fig. 4.5. Payoffs are assigned as shown.



But that doesn't end the story as it is entirely unreasonable that player X perfectly 
knows how Y is making their decisions. We now suppose that you, as player X, have 
watched your opponent walk into the game room with their toolkit and a large rectan- 
gular metal shield. Player Y erects their shield to entirely hide their part of the decision 
making device from your gaze, and behind this shield, they proceed to saw, hammer and 
weld away. You, as player X, are however entirely unsure what Y is doing behind their 
shield. Perhaps Y is reconstructing the original channel arrangements of the
conventionally mandated device of Fig. 4.4. Perhaps on the other hand, player Y is leaving
the channels exactly as configured in the correlated decision device of Fig. 4.5, and the
welding is required to reconstruct the required "q" and "r" urns. The resulting situation,
as perceived by yourself, is as shown in Fig. 4.6. Here, both you and player Y are depicted
as being certain about how player X will optimize their payoff. Namely, X will use an
urn to implement some mixed strategy "p" to optimize their payoff. However, you, as
player X, have no information about how player Y will make their decision. Again, you
have a decision to make. A conventional analysis mandates that player Y should use a
conventionally configured decision device and you should play accordingly. In this case,
Y will gain a payoff of $\langle \Pi^Y \rangle = 2$. However, Y could alternatively choose
to adopt a correlated probability space in which case they will gain a payoff of
$\langle \Pi^Y \rangle = 3$. Being rational, Y can be expected to seek to maximize their
expected payoff. What will you do? Will you assume that Y has adopted a conventionally
mandated space and drop the ball down the x = 0 channel in the hope that it stops half way,
requiring Y to walk over to the device to place it in the y = 1 channel? What a
disappointment then if the ball drops all the way down both the x = 0 and y = 0 channels
into the leftmost cup. Or alternatively, will you assume that Y is indeed a payoff
maximizer able to alter their choice of decision device, leading to the conclusion that Y
will have chosen to reconfigure the channels to implement correlated play? In this case,
you should drop the ball into the x = 1 channel in the hope that the ball will drop all the
way through both the x = 1 and y = 1 channels into the rightmost cup. What a disappointment
then if you see the ball stop half way, requiring Y to walk over to place the ball into the
y = 0 channel. What is your choice?

We suggest that if you know (by observing) that Y has perfectly correlated their choice 
of y to your choice of x, then you must take this information into account. Similarly, even 
without direct observation, if you can deduce that Y will perfectly correlate their choice 
of y to your choice of x, then likewise, you must take this information into account. 

In reality of course, the situation in a real strategic exchange is more akin to that 
shown in Fig. 4.7. Here, each player knows precisely the rules of the game including
all possible moves in their specified sequences. What they don't know is the choice of 
probability space made by their opponent. This ignorance is represented by the coloured 
shields shown in the figure. In fact, prior to their completing their own analysis, they 
do not know which probability space they will adopt, or whether they will choose a 
single space or randomize over a number of spaces. This is in sharp contrast to the 
presumption of conventional game theory which mandates that each player must use a 
particular probability space (or one of their equivalents). As noted above, there has 
never been a proof of the completeness, necessity or sufficiency of this mandated type of 
space. In view of this, we suggest that rational players will simply optimize their choice 
of probability space to maximize their expected payoff. In Fig. 4.7, you, as player X,
must deduce which space player Y will use to maximize their payoff. In the situation 
depicted here, Y has not physically reconstructed the decision recording device before 
your eyes, but they have likely chosen to adopt a particular probability space and physical 
randomization device. Their roulette might involve their preprogramming one or more
random number generators, or might involve their providing instructions to an agent who
will act autonomously once the game has begun allowing Y to leave the room and take no
further part in the game. As player X, you have absolutely no information whatsoever
about which roulette will be adopted by Y. The only fact you are sure of is that Y will
act so as to maximize their payoff.

Figure 4.7: The play of the two stage game when both player X and player Y are unsure
about which probability spaces and randomization devices have been adopted by their
opponents. In this case, each player perfectly shields their decision making apparatus from
their opponent (shaded devices), and so might be adopting the conventionally mandated
analysis or any of an infinite number of alternate possible probability spaces. Rational
players will analyze all these possibilities in order to maximize their payoffs. Payoffs are
assigned as shown.

The question is, as always, is it possible for Y to vary their choice of probability 
space, of their roulette, or is this impossible? If it is impossible, provide a proof of 
this conjecture, and then optimize accordingly. If it is possible, determine your optimal 
choices taking into account your opponent's optimal choices. 

4.4 Discussion 



We propose that rational players will optimize their expected payoff functionals (not func- 
tions) in strategic situations using generalized calculus of variations approaches. These 
generalized variational functional optimization methods examine every possible value of 
a functional at every point as well as every possible gradient through that point. A ratio- 
nal player, seeking to perform a complete optimization, must examine every one of these 
possibilities against all of the equivalent range of possibilities of their opponents. These
generalized methods give access to an infinity of non-independent and functionally con- 
strained probability measure spaces defining non-continuous expected payoff functionals 
defined over discontinuous domains possessing, perhaps, a gradient nowhere. 

The resulting generalized optimization approach corresponds to optimizing an infi- 
nite number of alternate game decision trees exhibiting altered optimal pathways and 
equilibria.

In this work, we follow the same methodology used by von Neumann and Morgenstern 
[1]. These authors initially focussed on single players, typified by Robinson Crusoe, who
tried to optimize their payoff by choosing their actual moves or pure strategies in a 
consumption game. They then showed that this optimization method (focussed solely on 
pure strategies) did not generalize to all multiple player games leading to the introduction 
of probability distributions over pure strategies, defining mixed strategies. That is, it 
was established by these and later authors that while certain games (single player or 
multiple-player-perfect-information games) had solutions in pure strategies, this was not 
always true of more general games, and as a mixed strategy analysis entirely subsumes a 
pure strategy analysis, it was always advisable for a rational player to perform a complete 
mixed strategy analysis for general games. Here, we suggest similar results. It seems to be 
sufficient to employ conventional analysis for single-player or multiple-player-single-stage 
games. However, we suggest that the complete analysis of multiple-player-multiple-stage 
games requires more than a conventional analysis. Again, as the conventional analysis is 
entirely subsumed within our augmented optimization approach, it seems advisable for 
rational players to perform an augmented analysis in general. 

In earlier chapters, we have alluded to the possibility that our expanded optimization
analysis would produce results which differ from standard results in game theory. This 
does not mean that game theory is wrong. Just as a theorem valid in a flat geometry — the 
interior angles of all triangles sum to 180 degrees — can be invalid in a curved geometry, 
then so can results validly derived in game theory be invalid in our extended analysis. 
Game theory is incomplete, rather than wrong. 

For instance, Kuhn established that games of perfect recall could always be decomposed
into discrete subgames, and that the equilibrium pathway of the entire game consisted
of concatenated portions of the equilibrium pathways of all the relevant subgames
[4]. Crucial to the proof of this result is the separability of the joint probability
distributions of the entire game, and such separability exists only for the independent
behavioural probability spaces developed by Kuhn. In our approach, behavioural strategies
are not necessarily independent so their governing probability spaces are not necessarily
separable. A theorem derived assuming that probability distributions are separable is not
applicable when distributions are inseparable.

Similarly, in the same paper, Kuhn established that games of perfect information 
always have pure strategy equilibria [1]. In our approach, even in perfect information
games, players are uncertain about which probability space might be adopted by their 
opponents, and this allows equilibria to be probabilistic. Again, there is no contradiction 
with existing results, as theorems derived assuming separable probability distributions 
are inapplicable when distributions are inseparable. 

All of the results and theorems of game theory are derived under certain assumptions 
about the joint probability spaces governing game analysis. When players can adopt al- 
ternate probability spaces invalidating these assumptions, then naturally, they can derive 
results which differ from those of game theory. Such differences reflect limitations in the 
optimization analysis of game theory, rather than errors in our more general optimization 
approach. 

Finally, we again remind ourselves that conventional analysis routinely predicts out- 
comes at odds with observation. As we later show, the extended analysis that we argue 
must be available to players of unbounded rationality, will produce outcomes entirely 
consistent with observation. 

Obviously, there are immediate applications of our new methods to sequential games 
such as the chain store paradox, the trust game, the ultimatum game, the public goods 
game, the centipede game, and the iterated prisoner's dilemma. We turn to this now. 
Chapter 5
Correlated Equilibria 



5.1 Introduction 

We are introducing isomorphism constraints into the strategy spaces of game theory. 
These constraints alter strategy space geometries to allow the location of new equilibria. 
It is useful to contrast our approach with Aumann's "correlated equilibria".

5.2 Correlated equilibria 

In 1974, Aumann modeled a nominally competitive game in which players co-opt public
roulettes and share information to improve their payoffs. This possibility arises as
the Nash equilibria for non-communicating players have them locating the best payoff
regardless of their opponent's choices, so correlated changes of strategy are impossible.
Given the ability to communicate however, correlated strategies become possible, allowing
novel equilibria. Following Aumann's terminology, these are now termed "correlated
equilibria".

Our work here differs from Aumann's approach. We allow players to alter their cho- 
sen private randomization devices but do not permit communication between players. 
We show that even without additional communication channels, if players use differ- 
ent physical randomization devices with different numbers of independent coordinates 
and functionally constrained coordinates, then these possible probability spaces must be 
taken into account. To clarify the differences and similarities between our entirely
non-communicating analysis and Aumann's correlated equilibria, we here go through one of
the examples used by Aumann in detail.

To model correlated equilibria, Aumann introduced probability measures into his
definitions of needed

equipment for randomizing strategies, and for defining utilities and subjective
probability for the players. Thus to the description of the game we append
the following:

(5) A set $\Omega$ (the states of the world), together with a $\sigma$-field
$\mathcal{B}$ of subsets of $\Omega$ (the events);

(6) For each player i, a sub-$\sigma$-field $\mathcal{I}_i$ of $\mathcal{B}$ (the events
in $\mathcal{I}_i$ are those regarding which i is informed).

(7) For each player i, a relation $\succsim_i$ (the preference order of i) on the space
of lotteries on the outcome space X, where a lottery on X is a $\mathcal{B}$-measurable
function from $\Omega$ to X [23].



This welter of definitions was made understandable by use of a series of worked examples, 
and we here follow the same route by examining in detail Aumann's example (2.7). 



Figure 5.1: The game tree for the two player non-zero-sum game considered by Aumann
in his example (2.7) [23]. Here, two players X and Y simultaneously and independently
choose one of two options $x, y \in \{0, 1\}$ to gain the payoff combinations shown.



In Aumann's example (2.7), the two-person payoff matrix is

\[
(\Pi^X, \Pi^Y) =
\begin{array}{c|cc}
      & y = 0 & y = 1 \\ \hline
x = 0 & (6,6) & (2,7) \\
x = 1 & (7,2) & (0,0)
\end{array}
\tag{5.1}
\]



In terms of the behavioural probability space defined in Fig. 5.1, the expected payoff
optimization problems are

\[ \begin{aligned}
X:\ \max_{p} \langle \Pi^X \rangle &= 6 + p - 4q - 3pq \\
Y:\ \max_{q} \langle \Pi^Y \rangle &= 6 - 4p + q - 3pq.
\end{aligned} \tag{5.2} \]
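As a quick consistency check, the expected payoffs of Eq. (5.2) can be recomputed directly
from the payoff matrix of Eq. (5.1). The sketch below assumes that p and q denote the
probabilities of the choices x = 1 and y = 1 respectively, and that the two choices are
independent; the names `PAYOFF`, `expected_payoffs` and `closed_form` are illustrative and
do not come from the source.

```python
from fractions import Fraction

# Payoff matrix (5.1): PAYOFF[x][y] = (payoff to X, payoff to Y).
PAYOFF = [[(6, 6), (2, 7)],
          [(7, 2), (0, 0)]]


def expected_payoffs(p, q):
    """Expected payoffs when X plays x = 1 with probability p and
    Y independently plays y = 1 with probability q (assumed convention)."""
    px = [1 - p, p]
    py = [1 - q, q]
    ex = sum(px[x] * py[y] * PAYOFF[x][y][0] for x in (0, 1) for y in (0, 1))
    ey = sum(px[x] * py[y] * PAYOFF[x][y][1] for x in (0, 1) for y in (0, 1))
    return ex, ey


def closed_form(p, q):
    """Right-hand sides of Eq. (5.2)."""
    return 6 + p - 4 * q - 3 * p * q, 6 - 4 * p + q - 3 * p * q


# Compare both expressions on a small grid of exact rational probabilities.
grid = [Fraction(k, 4) for k in range(5)]
matches = all(expected_payoffs(p, q) == closed_form(p, q) for p in grid for q in grid)
```

Exact rational arithmetic is used so that the comparison is not affected by floating
point rounding.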



These expected payoffs are continuous multivariate functions dependent only on the freely
varying parameters $(p, q)$ so the relevant gradient operator used by both players to
analyze this particular probability space is

\[ \nabla = \left[ \frac{\partial}{\partial p}, \frac{\partial}{\partial q} \right]. \tag{5.3} \]



Optimization then proceeds as usual via 

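\[ \begin{aligned}
\frac{\partial \langle \Pi^X \rangle}{\partial p} &= 1 - 3q \\
\frac{\partial \langle \Pi^Y \rangle}{\partial q} &= 1 - 3p
\end{aligned} \tag{5.4} \]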



so equilibria appear at the intersections shown in Fig. 5.2. As noted by Aumann, there are
three Nash equilibria for this game at choices $(p, q) = (0, 1)$, $(1, 0)$, and
$(\frac{1}{3}, \frac{1}{3})$, generating respective payoffs
$(\langle \Pi^X \rangle, \langle \Pi^Y \rangle) = (2, 7)$, $(7, 2)$, and
$(\frac{14}{3}, \frac{14}{3})$.




Figure 5.2: The intersection of the gradient conditions specifying Nash equilibria for the
two player non-zero-sum game considered by Aumann in his example (2.7) [23]. The
three Nash equilibria points are circled.
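The three Nash equilibria quoted above can be checked numerically. The following is a
minimal sketch, assuming the expected payoffs of Eq. (5.2) with p and q the probabilities
of x = 1 and y = 1; the function names are illustrative. Because each expected payoff is
linear in the player's own probability, it suffices to compare each candidate against the
two pure deviations.

```python
from fractions import Fraction


def payoff_x(p, q):
    """Expected payoff to X from Eq. (5.2)."""
    return 6 + p - 4 * q - 3 * p * q


def payoff_y(p, q):
    """Expected payoff to Y from Eq. (5.2)."""
    return 6 - 4 * p + q - 3 * p * q


def is_nash(p, q):
    """True if neither player gains by a unilateral deviation.

    Each payoff is linear in the player's own probability, so the best
    deviation is always attained at one of the pure choices 0 or 1."""
    best_x = max(payoff_x(d, q) for d in (0, 1))
    best_y = max(payoff_y(p, d) for d in (0, 1))
    return payoff_x(p, q) >= best_x and payoff_y(p, q) >= best_y


third = Fraction(1, 3)
candidates = [(0, 1), (1, 0), (third, third)]
nash_flags = [is_nash(p, q) for p, q in candidates]
nash_payoffs = [(payoff_x(p, q), payoff_y(p, q)) for p, q in candidates]
```

The same test rejects, for example, the profile (p, q) = (0, 0), where each player would
gain by switching to their second option.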

Aumann now supposes that the players share a public fair three-sided die allowing events
"A", "B", and "C" to be selected, each with probability 1/3, and that X is informed whether
or not event "A" appeared, while Y is told whether or not "C" appeared. Aumann then asks,
given this altered environment with additional communications, how players will now
optimize their expected payoffs. As a first step, the players must alter their probability
spaces to reflect the changed physical randomization devices being used.

One possibility is depicted in Fig. 5.3. Here, event $E \in \{A, B, C\}$ occurs, each with
probability 1/3, and conditions two additional variables $u, v \in \{0, 1\}$. Player X
knows the value of the variable u while player Y knows the value of v. The variable u is
set to u = 1 when E = A and u = 0 otherwise, while v = 1 when E = C and v = 0 otherwise.
The players can condition their subsequent choices on the u and v variables.

Figure 5.3: The modified game tree corresponding to the players sharing a three-sided
die selecting an event E = A, B, or C with equal probability 1/3, with player X advised
whether event A occurs or not (specified by the indicator variable u) while player Y is
advised whether event C occurs or not (indicated by the indicator variable v). The players
can then appropriately condition their decisions on their available information sets, as
indicated. The respective information sets are not adequately represented on this figure.
The altered expected payoff functions are then 



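\[ \begin{aligned}
X:\ \max \langle \Pi^X \rangle &= \sum_{Euv,x,y} P(Euv,x,y)\, \Pi^X(x,y) \\
&= \sum_{Euv,x,y} P(Euv)\, P^X(x|Euv)\, P^Y(y|Euv)\, \Pi^X(x,y) \\
&= \frac{1}{3} \left[ 18 + 2p_0 + p_1 - 8q_0 - 4q_1 - 3 \left( p_1 q_0 + p_0 q_0 + p_0 q_1 \right) \right] \\
Y:\ \max \langle \Pi^Y \rangle &= \sum_{Euv,x,y} P(Euv,x,y)\, \Pi^Y(x,y) \\
&= \sum_{Euv,x,y} P(Euv)\, P^X(x|Euv)\, P^Y(y|Euv)\, \Pi^Y(x,y) \\
&= \frac{1}{3} \left[ 18 - 8p_0 - 4p_1 + 2q_0 + q_1 - 3 \left( p_1 q_0 + p_0 q_0 + p_0 q_1 \right) \right]
\end{aligned} \tag{5.5} \]

written in terms of the joint probability distribution $P(Euv, x, y)$ spanning the
probability space, and where we recognize that the payoff functions $\Pi^Z(x,y)$ depend
only on the choices x and y, and we also take account of the various conditioning
possibilities of the variables. Here $p_u$ is the probability that X chooses x = 1 given
the observed value of u, and $q_v$ is the probability that Y chooses y = 1 given the
observed value of v.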
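The closed forms in Eq. (5.5) can be reproduced by summing over the three states of the
die. The sketch below is illustrative only: it assumes that $p_u$ is the probability of
x = 1 given u, that $q_v$ is the probability of y = 1 given v, and that the players
randomize independently once u and v are fixed. The names `STATES`, `expected_payoffs`
and `closed_form` are not from the source.

```python
from fractions import Fraction

# Payoff matrix (5.1): PAYOFF[x][y] = (payoff to X, payoff to Y).
PAYOFF = [[(6, 6), (2, 7)],
          [(7, 2), (0, 0)]]

# States of the world: event E mapped to its indicator values (u, v).
# u = 1 only when E = A (told to X); v = 1 only when E = C (told to Y).
STATES = {"A": (1, 0), "B": (0, 0), "C": (0, 1)}


def expected_payoffs(p0, p1, q0, q1):
    """Sum P(Euv) P(x|u) P(y|v) Pi(x, y) over all states and choices."""
    p = (p0, p1)
    q = (q0, q1)
    ex = ey = Fraction(0)
    for u, v in STATES.values():
        px = (1 - p[u], p[u])
        py = (1 - q[v], q[v])
        for x in (0, 1):
            for y in (0, 1):
                weight = Fraction(1, 3) * px[x] * py[y]
                ex += weight * PAYOFF[x][y][0]
                ey += weight * PAYOFF[x][y][1]
    return ex, ey


def closed_form(p0, p1, q0, q1):
    """Final lines of Eq. (5.5) for X and Y."""
    cross = p1 * q0 + p0 * q0 + p0 * q1
    ex = Fraction(1, 3) * (18 + 2 * p0 + p1 - 8 * q0 - 4 * q1 - 3 * cross)
    ey = Fraction(1, 3) * (18 - 8 * p0 - 4 * p1 + 2 * q0 + q1 - 3 * cross)
    return ex, ey


# Exact comparison on a coarse grid of conditional probabilities.
grid = [Fraction(k, 2) for k in range(3)]
matches = all(
    expected_payoffs(a, b, c, d) == closed_form(a, b, c, d)
    for a in grid for b in grid for c in grid for d in grid
)
```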
Consequently, in this expanded probability space the relevant gradient operator is

\[ \nabla = \left[ \frac{\partial}{\partial p_0}, \frac{\partial}{\partial p_1}, \frac{\partial}{\partial q_0}, \frac{\partial}{\partial q_1} \right] \tag{5.6} \]

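in terms of which the players evaluate

\[ \begin{aligned}
\frac{\partial \langle \Pi^X \rangle}{\partial p_0} &= \frac{1}{3} \left( 2 - 3q_0 - 3q_1 \right) \\
\frac{\partial \langle \Pi^X \rangle}{\partial p_1} &= \frac{1}{3} \left( 1 - 3q_0 \right) \\
\frac{\partial \langle \Pi^Y \rangle}{\partial q_0} &= \frac{1}{3} \left( 2 - 3p_0 - 3p_1 \right) \\
\frac{\partial \langle \Pi^Y \rangle}{\partial q_1} &= \frac{1}{3} \left( 1 - 3p_0 \right).
\end{aligned} \tag{5.7} \]

The second and fourth lines here specify that

\[
p_1 = \begin{cases}
1 & \text{if } q_0 < \frac{1}{3} \\
\text{arbitrary} & \text{if } q_0 = \frac{1}{3} \\
0 & \text{if } q_0 > \frac{1}{3}
\end{cases}
\qquad
q_1 = \begin{cases}
1 & \text{if } p_0 < \frac{1}{3} \\
\text{arbitrary} & \text{if } p_0 = \frac{1}{3} \\
0 & \text{if } p_0 > \frac{1}{3}
\end{cases}
\tag{5.8}
\]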

which in turn allows calculating the flow diagram for the remaining gradients in terms
of the variables $p_0$ and $q_0$ as shown in Fig. 5.4. This locates two unstable stationary
points at $(p_0, q_0) = (\frac{1}{3}, \frac{1}{3})$ and $(\frac{2}{3}, \frac{2}{3})$ and
three stable stationary points defining correlated equilibria at
$(p_0, q_0) = (0, 0)$, $(0, 1)$, and $(1, 0)$. The respective payoffs for each player at
these correlated equilibria points are
$(\langle \Pi^X \rangle, \langle \Pi^Y \rangle) = (5, 5)$, $(2, 7)$, and $(7, 2)$. There is
then an additional correlated equilibrium giving an increased expected payoff for each
player, motivating them to use the additional available information to correlate their
strategy choices to their opponent's moves.
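The payoffs quoted at the three stable points can be confirmed by completing each
$(p_0, q_0)$ with the values of $p_1$ and $q_1$ given by Eq. (5.8) and substituting into
Eq. (5.5). The sketch below assumes those two equations; the names `profile` and
`no_profitable_deviation` are illustrative. It checks only the equilibrium property (no
unilateral deviation pays) and says nothing about the stability of the gradient flow.

```python
from fractions import Fraction
from itertools import product


def payoffs(p0, p1, q0, q1):
    """Expected payoffs (X, Y) from the final lines of Eq. (5.5)."""
    cross = p1 * q0 + p0 * q0 + p0 * q1
    ex = Fraction(1, 3) * (18 + 2 * p0 + p1 - 8 * q0 - 4 * q1 - 3 * cross)
    ey = Fraction(1, 3) * (18 - 8 * p0 - 4 * p1 + 2 * q0 + q1 - 3 * cross)
    return ex, ey


def profile(p0, q0):
    """Complete (p0, q0) with p1 and q1 from Eq. (5.8).

    Only intended for points away from the indifference threshold 1/3."""
    p1 = 1 if 3 * q0 < 1 else 0
    q1 = 1 if 3 * p0 < 1 else 0
    return p0, p1, q0, q1


def no_profitable_deviation(p0, p1, q0, q1):
    """Each player's payoff is linear in their own pair of probabilities,
    so it is enough to test the four pure conditional strategies."""
    ex, ey = payoffs(p0, p1, q0, q1)
    for d0, d1 in product((0, 1), repeat=2):
        if payoffs(d0, d1, q0, q1)[0] > ex:
            return False
        if payoffs(p0, p1, d0, d1)[1] > ey:
            return False
    return True


stable_points = [(0, 0), (0, 1), (1, 0)]
equilibrium_payoffs = [payoffs(*profile(p0, q0)) for p0, q0 in stable_points]
equilibrium_flags = [no_profitable_deviation(*profile(p0, q0)) for p0, q0 in stable_points]
```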

The location of a correlated equilibrium point with improved payoffs to both players,
$(\langle \Pi^X \rangle, \langle \Pi^Y \rangle) = (5, 5)$, lying strictly outside the
convex hull of the Nash equilibrium payoffs concludes Aumann's example. To reiterate, every
change of the physical randomization device adopted by players, whether secret or public,
must be modelled by altered probability spaces. Aumann introduced these tools to model
correlated equilibria generated by players sharing a public randomization device and shared
communication. This communication means that novel correlated equilibria can be located
even in two-player single stage games.

Figure 5.4: The flow diagram showing the direction of the gradient of the respective
expected payoffs
$[\partial \langle \Pi^X \rangle / \partial p_0, \partial \langle \Pi^Y \rangle / \partial q_0]$
identifying two unstable stationary points at
$(p_0, q_0) = (\frac{1}{3}, \frac{1}{3})$ and $(\frac{2}{3}, \frac{2}{3})$ (open circles),
as well as three stable stationary points locating correlated equilibria at
$(p_0, q_0) = (0, 0)$, $(0, 1)$, and $(1, 0)$ (closed disks). The respective payoffs at the
correlated equilibria are
$(\langle \Pi^X \rangle, \langle \Pi^Y \rangle) = (5, 5)$, $(2, 7)$, and $(7, 2)$.
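The statement that the payoff (5, 5) lies strictly outside the convex hull of the Nash
equilibrium payoffs can be verified with a one-line separating argument. The sketch below
assumes the three Nash payoffs (2, 7), (7, 2) and (14/3, 14/3) of Aumann's example; the
variable names are illustrative. A linear function attains its maximum over a convex hull
at a vertex, so the total payoff of any point in the hull cannot exceed the largest total
payoff among the vertices.

```python
from fractions import Fraction

# Payoff pairs at the three Nash equilibria and at the new correlated equilibrium.
nash_payoffs = [(2, 7), (7, 2), (Fraction(14, 3), Fraction(14, 3))]
correlated = (5, 5)

# Largest total payoff attainable anywhere in the convex hull of the Nash payoffs.
hull_bound = max(u + v for u, v in nash_payoffs)

# The correlated payoff is outside the hull if its total exceeds that bound.
outside_hull = sum(correlated) > hull_bound
```

Here the bound is 28/3, which is less than the total of 10 obtained at the correlated
equilibrium.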

In contrast, our work with isomorphic constraints based on correlations eschews any 
additional communication between the players. Rather, players can adopt different secret 
randomization devices modelled by altered probability spaces possessing different dimen- 
sionality, continuity properties, differentiability conditions, and gradients, all of which 
allow the location of novel equilibria. The continued absence of communication between 
the players means that, as far as we can tell, novel constrained equilibria appear only in
multiple-player-multiple-stage games.



Chapter 6 

The chain store paradox 



6.1 Introduction 

The chain store paradox examines predatory pricing to maintain monopoly profits. It 
gains its "paradoxical" moniker as (so it has been argued [21]) a substantial proportion of 
the economics profession finds itself disagreeing with the clear predictions of game theory 
in this game. That is, many economists would hold that it is irrational for any firm 
to engage in predatory pricing to drive rivals out of business and so gain a monopolist 
position as predation is costly to the predator while potential new entrants well under- 
stand that any price cutting is temporary. It is also generally held that any attempt to 
extract monopoly pricing benefits in some industry would quickly attract new entrants 
so any monopoly gains will be short lived. An extensive literature has demonstrated the 
implausibility of these claims, with Ref. [23] examining predatory pricing in the shipping 
industry, IBM pricing strategies against competitors, and coffee price wars, for instance. 
Selten first proposed the chain store paradox as a complement to the finite iterated
prisoner's dilemma [25] in order to highlight inadequacies in game theory. These
shortcomings would then justify the necessity of bounding rationality in game theory.
Terming the conventional game theoretic analysis and predicted outcome as the "induction"
argument, and contrasting this with an alternate "deterrence" theory, Selten noted

"...only the induction theory is game theoretically correct. Logically, the 
induction argument cannot be restricted to the last periods of the game. 
There is no way to avoid the conclusion that it applies to all periods of the 
game. 

Nevertheless the deterrence theory is much more convincing. If I had to play 
the game in the role of [the monopolist], I would follow the deterrence theory. 
I would be very surprised if it failed to work. From my discussions with friends 
and colleagues, I get the impression that most people share this inclination. 
In fact, up to now I met nobody who said that he would behave according 
to the induction theory. My experience suggests that mathematically trained
persons recognize the logical validity of the induction argument, but they 
refuse to accept it as a guide to practical behavior. 

It seems safe to conjecture that even in a situation where all players know that 
all players understand the induction argument very well, [the monopolist] will 
adopt a deterrence policy and the other players will expect him to do so. 

The fact that the logical inescapability of the induction theory fails to destroy 
the plausibility of the deterrence theory is a serious phenomenon which merits 
the name of a paradox. We call it the 'chain store paradox'" [25].

Efforts to resolve the paradox include recognizing that players might not be sure that
their opponents are rational payoff maximizers due to the impact of mistakes or trembles,
rationality bounds, incomplete information, or altered definitions of rationality, all
of which necessitate use of subjective probabilities [25]. In addition, asymmetric
information can be introduced, whereby entrants are uncertain whether monopolists are
governed by behavioural rules which eliminate common knowledge of rationality and provide
a rationale for entrants to base their expectations of the monopolist's future behaviour
on its past actions [24], while the use of imperfect information or uncertainty about
monopolist payoffs allows the replication of observed behaviours [27]. Other approaches
include dropping common knowledge of rationality [28], or introducing incomplete and
imperfect information [29]. For a good review of how this paradoxical game contributes to
economic understanding, see [30].

Selten's construction of the paradox hinges on the use of "deterrence" theory in a
multiple stage game (involving repeated choices by the monopolist), whereby the monopolist
can adopt a non-rational strategy in early stages of the game to build a reputation for
implementing that strategy which induces their opponents to alter their own choices in
later stages. All subsequent treatments have followed Selten in modelling such multiple
stage games and have then introduced some mechanism to justify "reputational" effects.

In contrast, in our treatment here, by introducing isomorphic constraints into our 
strategy spaces, we can establish that it is rational for the monopolist to adopt the seem- 
ingly irrational choice even in a minimal game (where the monopolist makes a single 
response to a single entrance) where it is commonly thought that reputation or deter- 
rence effects cannot make an appearance. The conventional analysis of this minimal 
game is immediately solved via backwards induction dependent on the assumptions of 
a common knowledge of rationality (CKR), independent behavioural strategies defining 
separable joint probability distributions and allowing subgame decompositions. In our 
extended analysis, the adoption of isomorphically constrained joint probability spaces 
allows non-independent behavioural strategies described by non-separable joint proba- 
bility distributions all of which invalidate subgame decompositions and alter the optima 
located via backwards induction. We demonstrate this now. 




[Figure 6.1: decision tree. Payoffs $(\Pi^X, \Pi^Y)$ at the leaves: (0, 1) if X stays out,
(1, 0) if X enters and Y acquiesces, (-1, -1) if X enters and Y fights.]

Figure 6.1: A minimal chain store game decision tree in an unconstrained behavioural
space where a potential new market entrant X must decide to either stay out of a new
market x = 0 with probability 1 - p or enter the market x = 1 with probability p, in which
case the monopolist Y chooses to either acquiesce y = 0 with probability 1 - q or fight
y = 1 with probability q their entry, with the corresponding payoffs shown.



6.2 The chain store paradox 

The minimal chain store paradox, conventionally pictured in Fig. 6.1, is defined over two
sequential stages where first, a potential market entrant X must decide to either stay out
of a new market x = 0 or enter that market x = 1, where their opponent, the monopolist
Y, observes this choice. Should X stay out of the market, they neither gain nor lose any
payoff while Y gains monopolist profits so $(\Pi^X, \Pi^Y) = (0, 1)$. In contrast, should X
enter the market, Y must then decide whether to acquiesce to their opponent's entry y = 0 by
leaving prices unchanged and losing profits so $(\Pi^X, \Pi^Y) = (1, 0)$, or to fight y = 1 by
driving X out of business by price cutting so payoffs are $(\Pi^X, \Pi^Y) = (-1, -1)$.

6.2.1 Unconstrained behaviour strategy spaces 

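A standard analysis frames the behaviour strategy spaces of each player as being

\begin{align}
\mathcal{P}^X_B &= \{x \in \{0, 1\}, \{1 - p, p\}\} \nonumber \\
\mathcal{P}^Y_B &= \{y \in \{0, 1\}, \{1 - q, q\} \mid x = 1\}. \tag{6.1}
\end{align}

Here, player Y chooses their value of y only when advised that x = 1. In the joint
behaviour space $\mathcal{P}^X_B \times \mathcal{P}^Y_B$, the respective optimization
problems for the players are

\begin{align}
X &: \max_p \langle\Pi^X\rangle = p - 2pq \nonumber \\
Y &: \max_q \langle\Pi^Y\rangle = 1 - p - pq, \tag{6.2}
\end{align}

so the only independent parameters are p and q. In this joint space, the gradient operator
used by each player in their analysis is

\begin{equation}
\nabla = \left[ \frac{\partial}{\partial p}, \frac{\partial}{\partial q} \right] \tag{6.3}
\end{equation}

so optimal solutions are obtained via

\begin{align}
\frac{\partial \langle\Pi^X\rangle}{\partial p} &= 1 - 2q \nonumber \\
\frac{\partial \langle\Pi^Y\rangle}{\partial q} &= -p. \tag{6.4}
\end{align}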



The solutions to these conditions are graphed in Fig. 6.2. Here, the gradient of the
payoff for the monopolist Y is essentially always negative so Y sets q = 0 and so always
acquiesces to new market entrants. In turn, realizing this, X determines that the gradient
of their payoff is always positive and so always sets p = 1 and decides to enter the market.
There is also an equilibrium at the point p = 0 and q = 1, termed imperfect as it requires
Y to adopt an irrational strategy (to fight) when X stays out of the market even though
this intention cannot be sustained if indeed it turns out that X enters the market. The
resulting expected payoffs given that players adopt the sole perfect Nash equilibrium of
p = 1 and q = 0 are $(\langle\Pi^X\rangle, \langle\Pi^Y\rangle) = (1, 0)$.
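
As a minimal numerical check of the conditions above, the following Python sketch evaluates the expected payoffs of Eq. (6.2) and the gradients of Eq. (6.4), then tests the four pure profiles for profitable unilateral deviations on a coarse grid. The function names and the grid resolution are illustrative assumptions and are not part of the original analysis.

```python
from fractions import Fraction

def expected_payoffs(p, q):
    """Expected payoffs (X, Y) of Eq. (6.2): p = entry probability, q = fight probability."""
    return p - 2 * p * q, 1 - p - p * q

def gradients(p, q):
    """Partial derivatives of Eq. (6.4): d<Pi^X>/dp and d<Pi^Y>/dq."""
    return 1 - 2 * q, -p

def is_nash(p, q, grid):
    """True if neither player gains by a unilateral deviation to any grid point."""
    pay_x, pay_y = expected_payoffs(p, q)
    x_ok = all(expected_payoffs(p2, q)[0] <= pay_x for p2 in grid)
    y_ok = all(expected_payoffs(p, q2)[1] <= pay_y for q2 in grid)
    return x_ok and y_ok

# Coarse grid of deviations; payoffs are linear in each player's own variable,
# so checking a grid that includes 0 and 1 is enough for the pure profiles.
grid = [Fraction(k, 10) for k in range(11)]
pure_profiles = [(p, q) for p in (0, 1) for q in (0, 1)]
nash_pure = [(p, q) for (p, q) in pure_profiles if is_nash(p, q, grid)]
print(nash_pure)  # the two circled points: (p, q) = (0, 1) and (1, 0)
```

Only the point (p, q) = (1, 0) survives backwards induction, since the gradient with respect to q is negative whenever p > 0.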




Figure 6.2: The intersection of the gradient conditions specifying Nash equilibria for the 
minimal chain store paradox. The two Nash equilibria points are circled. 

It is useful to again remind ourselves how this conventional analysis without
isomorphism constraints models perfect correlations between x and y to show that the
monopolist cannot rationally sustain a perfectly correlated strategy. Suppose that Y seeks to
perfectly correlate y with x via q = 1. As usual, both players are perfectly capable of
evaluating the expected payoff gradients in the appropriate limit to obtain

\begin{align}
\lim_{q \to 1} \frac{\partial \langle\Pi^X\rangle}{\partial p} &= \lim_{q \to 1} (1 - 2q) = -1 \nonumber \\
\lim_{q \to 1} \frac{\partial \langle\Pi^Y\rangle}{\partial q} &= \lim_{q \to 1} (-p) = -p. \tag{6.5}
\end{align}

That is, even when the monopolist seeks to perfectly correlate their choice y with x, the 
non-zero gradients present at these points ensure they must rationally alter their intention 
so as to maximize their payoff. This conclusion is of course valid only when isomorphism 
constraints are absent so that behavioural strategy probability distributions are separable 
allowing subgame decompositions and optimization via backwards induction. Conversely, 
this result does not pertain when isomorphism constraints are in use.

Rational players of unbounded capacity are able to alter their choice of probability 
space, and will optimize this choice so as to maximize their expected payoffs. In each 
alternate space, the generated joint probability distributions might well involve non- 
independent variables so the joint probability distributions are nonseparable preventing 
conventional subgame decompositions and ensuring that novel equilibria can be located. 
We now complete a partial search of the possible joint probability spaces. 

6.2.2 Isomorphically correlated space $\mathcal{P}^X_B \times \mathcal{P}^Y_B|_{q=1}$

Suppose that player Y employs an isomorphism constraint q = 1 ensuring that variable
y is perfectly correlated to x via y = x and $y^2 = x^2 = xy = x$. We denote this space
$\mathcal{P}^Y_B|_{q=1}$. In this space, the optimization tasks facing the players are

\begin{align}
X &: \max_x \Pi^X = -x \nonumber \\
Y &: \Pi^Y = 1 - 2x. \tag{6.6}
\end{align}

It is immediately evident that player X maximizes their payoff in this space by setting
x = 0. The same result arises when expected payoffs are used where we have the relations
$\langle y \rangle = \langle x \rangle$ and
$\langle y^2 \rangle = \langle x^2 \rangle = \langle xy \rangle = \langle x \rangle$ giving

\begin{align}
X &: \max_p \langle\Pi^X\rangle = -p \nonumber \\
Y &: \langle\Pi^Y\rangle = 1 - 2p. \tag{6.7}
\end{align}

As usual, the decision by Y to adopt the $\mathcal{P}^Y_B|_{q=1}$ probability space leaves them
with no further decisions to optimize. The relevant gradient operator used by both players to
analyze this particular probability space is

\begin{equation}
\nabla = \frac{\partial}{\partial p} \tag{6.8}
\end{equation}

so optimization proceeds as usual via

\begin{equation}
\frac{\partial \langle\Pi^X\rangle}{\partial p} = -1, \tag{6.9}
\end{equation}

ensuring that player X chooses not to enter the market via p = 0 giving x = 0.
Consequently, this means that Y chooses y = 0 but this setting does not influence payoffs. That
is, when players (X, Y) adopt the $\mathcal{P}^X_B \times \mathcal{P}^Y_B|_{q=1}$ joint
probability space, they maximize their payoffs via the combination (x, y) = (0, 0) to garner
payoffs $(\langle\Pi^X\rangle, \langle\Pi^Y\rangle) = (0, 1)$.
In short, the monopolist has deterred any new entry into the market so they retain their
profit. The threat they made to retaliate was not empty and indeed, was sufficient to
modify rational outcomes.

6.2.3 The functionally anti-correlated space: $\mathcal{P}^X_B \times \mathcal{P}^Y_B|_{q=0}$

Alternatively, player Y might choose the alternate probability space $\mathcal{P}^Y_B|_{q=0}$
in which player Y chooses to functionally anti-correlate their y variable to the previous
choice of X via y = 1 - x and xy = 0. In the joint probability space
$\mathcal{P}^X_B \times \mathcal{P}^Y_B|_{q=0}$, the expected payoff optimization problem becomes

\begin{align}
X &: \max_x \Pi^X = x \nonumber \\
Y &: \Pi^Y = 1 - x. \tag{6.10}
\end{align}

It is immediately evident that player X maximizes their payoff in this space by setting
x = 1. The use of expected payoffs will lead to the same result as we have the relations
$\langle y \rangle = 1 - \langle x \rangle$ and $\langle xy \rangle = 0$ giving

\begin{align}
X &: \max_p \langle\Pi^X\rangle = p \nonumber \\
Y &: \langle\Pi^Y\rangle = 1 - p. \tag{6.11}
\end{align}

Again, the adoption of the $\mathcal{P}^Y_B|_{q=0}$ probability space leaves Y with no decisions
to optimize. As a result, the gradient operator is again

\begin{equation}
\nabla = \frac{\partial}{\partial p} \tag{6.12}
\end{equation}

with optimization giving

\begin{equation}
\frac{\partial \langle\Pi^X\rangle}{\partial p} = 1, \tag{6.13}
\end{equation}

ensuring that player X chooses to enter the market via p = 1 with x = 1. Consequently,
this means that Y chooses y = 0 but this setting does not influence payoffs. The result is
that when players (X, Y) adopt the $\mathcal{P}^X_B \times \mathcal{P}^Y_B|_{q=0}$ joint
probability space, they maximize their payoffs via the combination (x, y) = (1, 0) to garner
payoffs $(\langle\Pi^X\rangle, \langle\Pi^Y\rangle) = (1, 0)$.
In this space, X is undeterred and enters the market to garner the profits.

6.2.4 Expected payoff comparison across multiple probability spaces

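Altogether, the various joint probability spaces which might be adopted by the players
lead to a table of expected payoff outcomes of

\begin{equation}
\begin{array}{c|c}
(\langle\Pi^X\rangle, \langle\Pi^Y\rangle) & \mathcal{P}^X_B \\ \hline
\mathcal{P}^Y_B|_{q=0} & (1, 0) \\
\mathcal{P}^Y_B & (1, 0) \\
\mathcal{P}^Y_B|_{q=1} & (0, 1)
\end{array} \tag{6.14}
\end{equation}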



making it evident that to maximize their payoff, player Y must rationally elect to use
probability space $\mathcal{P}^Y_B|_{q=1}$ in preference to either $\mathcal{P}^Y_B$ or
$\mathcal{P}^Y_B|_{q=0}$. That is, Y will undertake to functionally correlate their choice to
the previous choice of the potential market entrant, and thereby deny themselves a choice
about the setting of y once the game has commenced. They do this knowing it to be the payoff
maximizing choice of probability space (among the few examined here). Knowing this, player X
will not enter the market even in this minimal chain store game. Similar results apply for
extended games with multiple markets and potential entrants. The clear prediction of our
analysis is that players of unbounded rationality will always commit to fighting entrants in
the chain store game even though this strategy appears to be non-rational when examined using
conventional analysis. That is, in the chain store game, a monopolist does not need to build a
reputation for aggression over initial stages to try to discourage potential entrants in later
stages. A monopolist of unbounded rationality is well aware that making a choice to adopt a
probability space in which their choices are functionally assigned to be correlated to their
opponent's is both payoff maximizing and rational.
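
The comparison in Eq. (6.14) can be reproduced by brute force. The sketch below is an illustration under stated assumptions: the dictionary keys, function names and tie-breaking rule (ties go to the smaller choice) are hypothetical conveniences, and each constrained space is represented simply as a response rule y = f(x) that Y fixes before play.

```python
# Pure-outcome payoffs (Pi^X, Pi^Y), indexed by (x, y); y is irrelevant when x = 0.
PAYOFFS = {(0, 0): (0, 1), (0, 1): (0, 1), (1, 0): (1, 0), (1, 1): (-1, -1)}

def best_entry(response):
    """X's payoff-maximizing x given Y's rule y = response(x); ties favour x = 0."""
    return max((0, 1), key=lambda x: (PAYOFFS[(x, response(x))][0], -x))

def committed_outcome(response):
    """Outcome (x, y, payoffs) when Y has committed to a response rule in advance."""
    x = best_entry(response)
    y = response(x)
    return x, y, PAYOFFS[(x, y)]

def unconstrained_outcome():
    """Backward induction: Y picks y after seeing x, and X anticipates that choice."""
    response = lambda x: max((0, 1), key=lambda y: (PAYOFFS[(x, y)][1], -y))
    return committed_outcome(response)

# Hypothetical labels for the three spaces compared in the text.
spaces = {
    "unconstrained": unconstrained_outcome(),
    "q=0": committed_outcome(lambda x: 1 - x),  # anti-correlated, y = 1 - x
    "q=1": committed_outcome(lambda x: x),      # correlated, y = x
}
best_space_for_y = max(spaces, key=lambda name: spaces[name][2][1])
```

Under these assumptions, `best_space_for_y` selects the correlated space, matching the last row of Eq. (6.14).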

It is of course possible to consider a broader range of joint probability spaces for both
players X and Y, but these do not alter the conclusion here that it can be rational for a
monopolist to punish market entrants, which resolves the chain store paradox.

Chapter 7
The trust game 



7.1 Introduction 

The previous chapter considered what conventional analysis holds to be anomalous ag- 
gression, anomalous as it decreases the payoffs of the aggressive player. In this chapter, 
we consider trusting behaviour where players transfer their own payoffs to their opponent 
in the hope that their opponent will return the favour and transfer an enlarged pool of 
funds back to them. Needless to say, the conventional analysis holds that each of these 
trusting actions is anomalous. In this chapter, we consider the single shot trust game. 

In earlier formulations, the trust game took place over repeated stages [31] allowing
reputation and punishment theories to explain why players can exhibit trust and increase
their payoffs over those predicted by game theory. Such results motivated investigations
of single shot trust games (initially termed the investment game) where the minimal
number of stages ensures that reputation and punishment effects are absent. Despite
this, players continue to exhibit trust to increase their payoff [32]. More recently, players
involved in the trust game have undergone functional magnetic resonance imaging of their
brains during play [33]. Other minimal games eliminating reputation and punishment
effects are the ultimatum and the dictator game among others.

7.2 A simplified trust game 

In this section, we simplify the trust game as far as possible without losing any of its 
character. 



The minimal trust game, as conventionally pictured in Fig. 7.1, is defined over two
sequential stages where first, player X possesses a single unit of funds and must choose to
either retain these funds x = 0, generating payoffs of $(\Pi^X, \Pi^Y) = (1, 0)$, or trust
their opponent by investing their funds with Y via x = 1. Should this investment occur,
both players are aware that Y receives three units and must then decide how much
of this total to keep and how much to return to X. That is, Y decides to retain an
amount $y \in \{0, 1, 2, 3\}$ while returning an amount 3 - y to X, generating payoffs of
$(\Pi^X, \Pi^Y) = (3 - y, y)$. Altogether, the payoffs to the players are

\begin{align}
\Pi^X &= 1 - x + x(3 - y) \nonumber \\
\Pi^Y &= xy. \tag{7.1}
\end{align}

Figure 7.1: A minimal trust game wherein player X possesses funds of one unit and
must choose to either retain these funds x = 0, generating payoffs of $(\Pi^X, \Pi^Y) = (1, 0)$,
or trust their opponent by investing their funds with Y via x = 1. Should this investment
occur, both players are aware that Y receives three units, and must then decide how
much of this total to keep and how much to return to X. That is, Y decides to retain
an amount $y \in \{0, 1, 2, 3\}$ while returning an amount 3 - y to X, generating payoffs of
$(\Pi^X, \Pi^Y) = (3 - y, y)$.

7.2.1 Unconstrained behaviour strategy spaces 

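Conventional game analysis commences with the assumption that players X and Y each
adopt a probability space lacking isomorphism constraints. Possible spaces include

\begin{align}
\mathcal{P}^X_B &= \{x \in \{0, 1\}, \{1 - p, p\}\} \nonumber \\
\mathcal{P}^Y_B &= \{y \in \{0, 1, 2, 3\}, \{q, r, s, t\} \mid x = 1\}. \tag{7.2}
\end{align}

Here, player Y chooses their value of y only when advised that x = 1 and we have the
normalization condition q + r + s + t = 1. In the joint behaviour space
$\mathcal{P}^X_B \times \mathcal{P}^Y_B$, the respective optimization problems for the players are

\begin{align}
X &: \max_p \langle\Pi^X\rangle = 1 - p + p(3q + 2r + s) \nonumber \\
Y &: \max_{q,r,s} \langle\Pi^Y\rangle = p(3 - 3q - 2r - s). \tag{7.3}
\end{align}

The only independent variables here are p, q, r and s (subject to normalization
constraints) so the relevant gradient operator is

\begin{equation}
\nabla = \left[ \frac{\partial}{\partial p}, \frac{\partial}{\partial q}, \frac{\partial}{\partial r}, \frac{\partial}{\partial s} \right]. \tag{7.4}
\end{equation}

Consequently, optimal solutions are obtained via

\begin{align}
\frac{\partial \langle\Pi^X\rangle}{\partial p} &= -1 + 3q + 2r + s \nonumber \\
\frac{\partial \langle\Pi^Y\rangle}{\partial q} &= -3p \nonumber \\
\frac{\partial \langle\Pi^Y\rangle}{\partial r} &= -2p \nonumber \\
\frac{\partial \langle\Pi^Y\rangle}{\partial s} &= -p. \tag{7.5}
\end{align}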



The last three equations here straightforwardly show that Y maximizes their expected
payoff by setting q = r = s = 0 ensuring t = 1 to give y = 3. In turn, this result
simplifies the optimization condition for X, establishing that X maximizes their payoff
by setting p = 0 to give x = 0. The Nash equilibrium for this simplified trust game is
then (x, y) = (0, 3) so both X and Y selfishly retain all the funds they can, generating
expected payoffs of $(\langle\Pi^X\rangle, \langle\Pi^Y\rangle) = (1, 0)$.
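
The gradient argument above can be checked numerically. The following Python sketch is illustrative only: the function names and the vertex encoding of Y's pure choices are assumptions, with t eliminated through the normalization q + r + s + t = 1.

```python
def expected_payoffs(p, q, r, s):
    """Eq. (7.3) with t = 1 - q - r - s eliminated by normalization."""
    return 1 - p + p * (3 * q + 2 * r + s), p * (3 - 3 * q - 2 * r - s)

def gradient(p, q, r, s):
    """Eq. (7.5): d<Pi^X>/dp followed by d<Pi^Y>/dq, d<Pi^Y>/dr, d<Pi^Y>/ds."""
    return -1 + 3 * q + 2 * r + s, -3 * p, -2 * p, -p

# Y's pure choices y = 0, 1, 2, 3 correspond to these (q, r, s) vertices.
vertices = {0: (1, 0, 0), 1: (0, 1, 0), 2: (0, 0, 1), 3: (0, 0, 0)}

# Given that X has invested (p = 1), Y's best pure reply is to retain everything.
best_y = max(vertices, key=lambda y: expected_payoffs(1, *vertices[y])[1])

# With q = r = s = 0, X's gradient in p is -1, so X sets p = 0.
grad_at_equilibrium = gradient(0, 0, 0, 0)
equilibrium_payoffs = expected_payoffs(0, 0, 0, 0)
```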

As noted previously, these payoffs are not optimal as they could be improved by both 
players adopting different choices, as is commonly observed in human play. 



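[Figure 7.2: decision tree. Payoffs $(\Pi^X, \Pi^Y)$ at the leaves: (1, 0) if X retains their
funds, $(3 - \bar{y}, \bar{y})$ if X invests.]

Figure 7.2: The case where players (X, Y) adopt the
$\mathcal{P}^X_B \times \mathcal{P}^Y_B|_{y=\bar{y}}$ joint probability space
where player Y functionally correlates their second stage choice to their opponent's first
stage choice by committing to retain a fixed amount $\bar{y}$. In this case, a decision by X
to invest funds with Y automatically invokes a partial return of funds.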



7.2.2 The isomorphically correlated space $\mathcal{P}^X_B \times \mathcal{P}^Y_B|_{y=\bar{y}}$

Rational players are able to alter their choice of probability space, and will optimize
this choice so as to maximize their expected payoffs. Suppose that player Y considers
an alternate probability space denoted $\mathcal{P}^Y_B|_{y=\bar{y}}$ in which the choice of
the variable y is determined by the preceding choice of x via

\begin{equation}
y = 3(1 - x) + x\bar{y}. \tag{7.6}
\end{equation}

This means that when x = 0 we have y = 3 while the choice x = 1 enforces the setting
$y = \bar{y}$ for $\bar{y} \in \{0, 1, 2, 3\}$. This possibility is shown in Fig. 7.2. Noting we
still have $x^2 = x$ and x(1 - x) = 0, the payoffs to each player are

\begin{align}
X &: \max_x \Pi^X = 1 + x(2 - \bar{y}) \nonumber \\
Y &: \Pi^Y = x\bar{y}. \tag{7.7}
\end{align}

It is evident that player X will set x = 1 provided $\bar{y} < 2$ and x = 0 when $\bar{y} > 2$.
They are indifferent when $\bar{y} = 2$ and so will play safe with x = 0. The same results
appear when the expected payoffs are maximized via

\begin{align}
X &: \max_p \langle\Pi^X\rangle = 1 + 2p - p\bar{y} \nonumber \\
Y &: \langle\Pi^Y\rangle = p\bar{y}. \tag{7.8}
\end{align}

The relevant gradient operator is

\begin{equation}
\nabla = \frac{\partial}{\partial p} \tag{7.9}
\end{equation}

and optimization proceeds via

\begin{equation}
\frac{\partial \langle\Pi^X\rangle}{\partial p} = (2 - \bar{y}). \tag{7.10}
\end{equation}

As a result, X maximizes their payoff by setting p = 1 whenever $\bar{y} < 2$, and p = 0
otherwise. Subsequently, because Y has left themselves no free choices during the game,
the outcomes $(\bar{y}, x, y, \langle\Pi^X\rangle, \langle\Pi^Y\rangle)$ are respectively
(0, 1, 0, 3, 0), (1, 1, 1, 2, 1), (2, 0, 3, 1, 0), and (3, 0, 3, 1, 0).

7.2.3 Expected payoff comparison across multiple probability spaces

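The optimal payoffs in the various joint probability spaces considered here which might
be adopted by the players are

\begin{equation}
\begin{array}{c|c}
(\langle\Pi^X\rangle, \langle\Pi^Y\rangle) & \mathcal{P}^X_B \\ \hline
\mathcal{P}^Y_B & (1, 0) \\
\mathcal{P}^Y_B|_{y=0} & (3, 0) \\
\mathcal{P}^Y_B|_{y=1} & (2, 1) \\
\mathcal{P}^Y_B|_{y=2} & (1, 0) \\
\mathcal{P}^Y_B|_{y=3} & (1, 0)
\end{array} \tag{7.11}
\end{equation}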



This makes it evident that to maximize their payoff, Y must rationally elect to use the
probability space $\mathcal{P}^Y_B|_{y=1}$ in preference to any alternate probability space
considered here. That is, player Y will undertake to functionally correlate their second stage
decision to the previous choice of their opponent, and thereby deny themselves a second stage
choice during the game, knowing this to be the payoff maximizing choice. Knowing this,
X is confident enough to send all of their funds to Y with the clear expectation of making
a profit. This prediction of our extended analysis is in accord with observation.
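
The table in Eq. (7.11) can be regenerated by enumerating Y's possible commitments. This is a sketch under stated assumptions: `y_bar` is a hypothetical name for the amount Y commits to retaining if X invests, and X is assumed to keep their funds when indifferent, as in the text.

```python
def payoffs(x, y):
    """Eq. (7.1): X keeps 1 - x and receives x(3 - y); Y retains x * y."""
    return 1 - x + x * (3 - y), x * y

def committed_outcome(y_bar):
    """Y commits to retaining y_bar if X invests; X best-responds, keeping funds on a tie."""
    invest = payoffs(1, y_bar)[0] > payoffs(0, 3)[0]
    x = 1 if invest else 0
    y = 3 * (1 - x) + x * y_bar  # functional assignment of Eq. (7.6)
    return x, y, payoffs(x, y)

# One row per commitment y_bar in {0, 1, 2, 3}: (x, y, (payoff X, payoff Y)).
table = {y_bar: committed_outcome(y_bar) for y_bar in range(4)}

# Y's payoff-maximizing commitment among those enumerated.
best_y_bar = max(table, key=lambda y_bar: table[y_bar][2][1])
```

Under these assumptions the only commitment giving Y a positive payoff is retaining one unit, which returns two units to X.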
Chapter 8

The ultimatum game 



8.1 Introduction 

The prevalence and importance of bargaining in society justifies the examination of simple 
bargaining models such as the ultimatum game, particularly in view of the discrepancy 
between observed player strategies and rational equilibrium solutions [34]. In the ultimatum
game, two players must divide an item of equal utility to both (generally money).
One player, the proposer, offers a proportional division to the other, the responder, who 
must either accept it in which case the division proceeds as suggested, or reject it in which 
case neither player receives any money. The assumption that players are rational and 
payoff maximizing allows derivation of the subgame perfect equilibrium where in each 
stage the proposer offers the smallest positive amount of money possible which the re- 
sponder accepts as receiving some amount of money, however small, is always better than 
receiving none. This solution is seldom observed in experiments making the ultimatum 
game an ideal vehicle for testing the assumptions of game theory. 

This role as a game theory test-bed has long been explored [35], [36l EH ETJ [3H1 EHl 
Ho] , and tested by many experiments including examination of the influence of variable 
stake sizes [41], [42], [43] and of culture [44], [45]. See experimental surveys in [46], [47],
[48]. Experimental results typically demonstrate offers closer to a fair split (50%), and
frequent rejections of offers even substantially above 0% (approximately the predicted 
equilibrium offer). Further, more detailed analysis shows that players, while failing to 
locate the subgame perfect equilibrium, are performing a sophisticated matching of offers 
to acceptance probabilities so as to maximize payoffs [19], while the ability to track a 
changing game environment demonstrates that proposers can be induced to vary their 
offer ranges and that responders can expand their acceptance sets; in effect, offers and
acceptances are contingent on the possibly changing game environment [50].

Proposed modifications to game theory to generate the observed payoff maximizing
behaviour have focused on introducing mechanisms to complement player self-interest.
In the main, these proposed additions either exploit modified utility functions
interdependent on both players' payoffs by taking account of psychological factors
(so player utility increases with player equity or player intentionality say), or embed
the ultimatum game within a larger, perhaps societal game (taking account of
player reputation and self image for instance). These differing approaches include fairness
[SSlEIlESlllllESlESlIMlESlEniEZ], though with equity definitions generally self-serving 
and modified by player information and payoff asymmetries [58], rivalry [59], reciprocity
[60], [61], envy [62], punishment and revenge [63], competition and cooperation [64],
altruism and spitefulness [65], and reputation [66]. In these approaches, player strategies
effectively become contingent on both players' payoffs, generating novel equilibria allowing
more equitable play.

Player learning can be modelled via algorithms modifying current strategy selections 
(offers and acceptance probabilities) in the light of prior game events [121 EZ] which 
again makes player strategies contingent on those of their opponents to generate novel 
equilibria. See also [68], [69], [70]. Essentially the same algorithm can be implemented at
the population level using evolutionary games theory in which players observe and learn 
about previous acceptances and rejections of other players and modify their strategies 
accordingly [71], or simply learn which payoff splits maximize payoffs [72]. See also
[73], [74]. Again, these approaches effectively make current strategy choice contingent on
prior game events to generate novel equilibria. 



[Figure 8.1: decision tree. X makes an offer x, then Y chooses y. Payoffs $(\Pi^X, \Pi^Y)$ at
the leaves: (0, 0) or (M - 1, 1) for x = 1, (0, 0) or (M - 2, 2) for x = 2, and so on.]

Figure 8.1: A conventional tree of the two stage ultimatum game. In this decision tree,
X makes an integral offer $1 \le x \le M$ with probability $p_x$ to Y who either accepts the
offer by choosing y = 1 with probability $q_x$ or who rejects the offer by setting y = 0 with
probability $1 - q_x$. If the offer is accepted, the player payoffs are
$(\Pi^X, \Pi^Y) = (M - x, x)$ while if the offer is rejected, player payoffs are
$(\Pi^X, \Pi^Y) = (0, 0)$.

8.2 The Ultimatum game



As shown in Fig. 8.1, the ultimatum game is defined here over two sequential stages
where first X communicates an integral offer $1 \le x \le M$ to Y. Player Y must then
decide whether to accept the offer by choosing y = 1 in which case Y keeps the offer
amount x and X receives an amount M - x. Alternately, Y rejects the offer by choosing
y = 0 in which case neither player receives any payoff. That is, the payoffs are

\begin{align}
\Pi^X &= (M - x) y \nonumber \\
\Pi^Y &= xy. \tag{8.1}
\end{align}

A quick optimization analysis (achieved by straightforwardly embedding the discrete
payoffs in the corresponding continuous functions) has

\begin{align}
\frac{\partial \Pi^X}{\partial x} &= -y \le 0 \nonumber \\
\frac{\partial \Pi^Y}{\partial y} &= x > 0, \tag{8.2}
\end{align}

indicating that player X can increase their payoff by setting x as small as possible, so
x = 1, while player Y increases their payoff by making y as large as possible, so y = 1.
This gives the equilibrium point (x, y) = (1, 1) generating payoffs of
$(\Pi^X, \Pi^Y) = (M - 1, 1)$. However, few human players adopt this equilibrium point.

A more detailed analysis has players seeking to alter their choices of probability spaces
$\mathcal{P}^X$ and $\mathcal{P}^Y$ so as to maximize their respective payoffs. As previously,
players must determine which joint probability space defining the joint probability
distributions will optimize payoff outcomes.

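8.2.1 The isomorphically unconstrained space: $\mathcal{P}^X_B \times \mathcal{P}^Y_B$

The conventional analysis of the ultimatum game commences with players X and Y each
adopting a probability space lacking isomorphism constraints. Possible spaces include

\begin{align}
\mathcal{P}^X_B &= \{x \in \{1, 2, \ldots, M\}, \{p_1, p_2, \ldots, p_M\}\} \nonumber \\
\mathcal{P}^Y_B &= \{y \in \{0, 1\}, \{P^Y(y = 0 | x = z) = (1 - q_z), P^Y(y = 1 | x = z) = q_z, \forall z\}\}. \tag{8.3}
\end{align}

Here, we have the normalization condition $\sum_i p_i = 1$.

In the joint behaviour space $\mathcal{P}^X_B \times \mathcal{P}^Y_B$, the respective
optimization problems for the players are

\begin{align}
X &: \max_{p_2, \ldots, p_M} \langle\Pi^X\rangle = q_1 (M - 1) - \sum_{i=2}^{M} p_i \left[ q_1 (M - 1) - q_i (M - i) \right] \nonumber \\
Y &: \max_{q_1, \ldots, q_M} \langle\Pi^Y\rangle = q_1 + \sum_{i=2}^{M} p_i (q_i i - q_1). \tag{8.4}
\end{align}

We have here resolved the normalization condition via $p_1 = 1 - \sum_{i=2}^{M} p_i$.
Consequently, the expected payoffs are continuous multivariate functions dependent on the
probability parameters $(p_2, \ldots, p_M, q_1, \ldots, q_M)$, so the relevant gradient operator
used by both players to analyze this particular probability space is

\begin{equation}
\nabla = \left[ \frac{\partial}{\partial p_2}, \ldots, \frac{\partial}{\partial p_M}, \frac{\partial}{\partial q_1}, \ldots, \frac{\partial}{\partial q_M} \right]. \tag{8.5}
\end{equation}

Immediately then, the optimization conditions evaluated by each player are

\begin{align}
\frac{\partial \langle\Pi^X\rangle}{\partial p_i} &= -\left[ (M - 1) q_1 - (M - i) q_i \right], \quad \forall i \in [2, M] \nonumber \\
\frac{\partial \langle\Pi^Y\rangle}{\partial q_i} &= i p_i, \quad \forall i \in [1, M]. \tag{8.6}
\end{align}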



The conditions for rates of change of Y's payoff with respect to $q_1, \ldots, q_M$ here are all
non-negative ensuring that Y sets $q_1 = \ldots = q_M = 1$ and thus accepts any offer from X
greater than or equal to x = 1. In turn, these determinations simplify the optimization
conditions for X wherein the rates of change for X's payoff with respect to all of
$p_2, \ldots, p_M$ are negative so X sets $p_2 = \ldots = p_M = 0$ and $p_1 = 1$. The resulting
choices by each player are (x, y) = (1, 1) generating expected payoffs of
$(\langle\Pi^X\rangle, \langle\Pi^Y\rangle) = (M - 1, 1)$. This
is the unique Nash equilibrium point for this ultimatum game, given the adoption of the
joint probability space $\mathcal{P}^X_B \times \mathcal{P}^Y_B$. Unfortunately, it is not an
equilibrium adopted by many human players.
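
The conventional solution can be reproduced by a direct backward induction over the integral offers. The sketch below is illustrative; the function name and return layout are assumptions, and Y is taken to accept whenever acceptance pays strictly more than rejection.

```python
def subgame_perfect(M):
    """Backward induction over integral offers 1..M in the unconstrained space."""
    # Stage 2: accepting pays Y the amount x, rejecting pays 0, so Y accepts if x > 0.
    accept = {x: 1 if x > 0 else 0 for x in range(1, M + 1)}
    # Stage 1: X picks the offer maximizing (M - x) * y given Y's responses.
    x_star = max(range(1, M + 1), key=lambda x: (M - x) * accept[x])
    y_star = accept[x_star]
    return x_star, y_star, ((M - x_star) * y_star, x_star * y_star)

# Example with a hypothetical stake of M = 10 units.
result = subgame_perfect(10)
```

For M = 10 this gives the offer x = 1, accepted, with payoffs (9, 1), in line with (M - 1, 1) above.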

Rational players will be very aware that both they and their opponent can alter 
their choice of probability space, and will optimize this choice so as to maximize their 
expected payoffs. In these alternate spaces, the random probability variables used in 
player optimizations might well be non-independent so joint probability distributions are 
nonseparable preventing conventional subgame decompositions and ensuring that novel 
equilibria can be located. We illustrate this now accomplishing, as usual, only a partial 
search of the available infinity of probability spaces. 



8.2.2 An isomorphically constrained space: $\mathcal{P}^X_B \times \mathcal{P}^Y_B|_{y=\bar{y}}$

Suppose that player Y adopts one of a possible M - 1 alternate probability spaces
$\mathcal{P}^Y_B|_{y=\bar{y}}$ for integral $2 \le \bar{y} \le M$ in which they correlate their
y variable with the previous value x. In particular, suppose that Y undertakes to reject any
offer x less than $\bar{y}$ and to accept any offer x equal to or greater than $\bar{y}$. That
is, Y adopts the functional assignment

\begin{equation}
y = \begin{cases} 0, & x < \bar{y} \\ 1, & x \ge \bar{y}. \end{cases} \tag{8.7}
\end{equation}

In other words, we have $y = \delta_{x \ge \bar{y}}$ giving the payoff functions

\begin{align}
X &: \max_x \Pi^X = (M - x) \delta_{x \ge \bar{y}} \nonumber \\
Y &: \Pi^Y = x \delta_{x \ge \bar{y}}. \tag{8.8}
\end{align}

It is then evident that player X will set $x = \bar{y}$ to maximize their payoff at
$\Pi^X = (M - \bar{y})$, giving player Y a payoff of $\Pi^Y = \bar{y}$. Similar results are
obtained from optimizing the expected payoff functions obtained using the probability
distribution

\begin{equation}
P^Y(y|x) = \begin{cases}
P^Y(y = 0 | x) = 1 - \sum_{i=\bar{y}}^{M} \delta_{xi} \\
P^Y(y = 1 | x) = \sum_{i=\bar{y}}^{M} \delta_{xi}.
\end{cases} \tag{8.9}
\end{equation}



Players of unbounded rationality must then sequentially assume that players X and Y
have adopted the joint probability space $\mathcal{P}^X_B \times \mathcal{P}^Y_B|_{y=\bar{y}}$
for $2 \le \bar{y} \le M$, and within each space, locate the constrained equilibria optimizing
outcomes, all of which can be subsequently compared in a later comparison table. We complete
this process now.

[Figure 8.2: decision tree. Payoffs $(\Pi^X, \Pi^Y)$ at the leaves:
$(M - \bar{y}, \bar{y})$, $(M - \bar{y} - 1, \bar{y} + 1)$, and so on.]

Figure 8.2: The case where players (X, Y) adopt the
$\mathcal{P}^X_B \times \mathcal{P}^Y_B|_{y=\bar{y}}$ joint probability space
where player Y is functionally constrained to reject any offer $x < \bar{y}$ and to accept any
offer $x \ge \bar{y}$. As a result offers of a lesser amount appear neither in the expected
payoff functions nor in the corresponding game tree.



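With the adoption of the joint probability space
$\mathcal{P}^X_B \times \mathcal{P}^Y_B|_{y=\bar{y}}$, and taking account of the normalization
condition $p_{\bar{y}} = 1 - \sum_{i=\bar{y}+1}^{M} p_i$, the expected payoff optimization
problems for the players become

\begin{align}
X &: \max_{p_{\bar{y}+1}, \ldots, p_M} \langle\Pi^X\rangle = \sum_{i=\bar{y}}^{M} p_i (M - i) = (M - \bar{y}) - \sum_{i=\bar{y}+1}^{M} p_i (i - \bar{y}) \nonumber \\
Y &: \langle\Pi^Y\rangle = \sum_{i=\bar{y}}^{M} p_i i = \bar{y} + \sum_{i=\bar{y}+1}^{M} p_i (i - \bar{y}), \tag{8.10}
\end{align}

which are now dependent only on the freely varying parameters
$(p_{\bar{y}+1}, \ldots, p_M)$. That is, given their previous choice of probability space,
player Y has no further independent parameters, while player X is indifferent to any choice
with $1 \le i < \bar{y}$ because these variables have disappeared from the problem
specification. The resulting game tree is as shown in Fig. 8.2. The relevant gradient operator
used by both players to analyze this particular probability space is

\begin{equation}
\nabla = \left[ \frac{\partial}{\partial p_{\bar{y}+1}}, \ldots, \frac{\partial}{\partial p_M} \right]. \tag{8.11}
\end{equation}

Optimization then proceeds as usual via

\begin{equation}
\frac{\partial \langle\Pi^X\rangle}{\partial p_i} = -(i - \bar{y}), \quad \forall i \in [\bar{y} + 1, M]. \tag{8.12}
\end{equation}

All of the terms on the right hand side are negative ensuring that player X sets
$p_{\bar{y}+1} = \ldots = p_M = 0$. In turn, this means that X sets $p_{\bar{y}} = 1$ and only
ever offers $x = \bar{y}$. (When $\bar{y} = M$, player X gains zero payoff regardless of their
offer and so is indifferent.) Consequently, in the joint probability space
$\mathcal{P}^X_B \times \mathcal{P}^Y_B|_{y=\bar{y}}$, players (X, Y) choose the
combination $(x, y) = (\bar{y}, 1)$ to garner payoffs
$(\langle\Pi^X\rangle, \langle\Pi^Y\rangle) = (M - \bar{y}, \bar{y})$.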

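8.2.3 Payoff comparison across probability spaces

The above analysis has considered a total of one conventional joint probability space
$\mathcal{P}^X_B \times \mathcal{P}^Y_B$ and M - 1 alternate probability spaces
$\mathcal{P}^X_B \times \mathcal{P}^Y_B|_{y=\bar{y}}$ for $2 \le \bar{y} \le M$. Altogether,
the various joint probability spaces adopted by the players lead to a table of expected
payoff outcomes of

\begin{equation}
\begin{array}{c|c}
(\langle\Pi^X\rangle, \langle\Pi^Y\rangle) & \mathcal{P}^X_B \\ \hline
\mathcal{P}^Y_B & (M - 1, 1) \\
\mathcal{P}^Y_B|_{y=2} & (M - 2, 2) \\
\vdots & \vdots \\
\mathcal{P}^Y_B|_{y=M-2} & (2, M - 2) \\
\mathcal{P}^Y_B|_{y=M-1} & (1, M - 1)
\end{array} \tag{8.13}
\end{equation}

making it evident that to maximize their payoff, player Y must rationally elect to use
probability space $\mathcal{P}^Y_B|_{y=M-1}$ in preference to $\mathcal{P}^Y_B$. Knowing this,
player X will offer x = (M - 1) to Y to ensure that they gain a payoff greater than zero.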
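
The rows of Eq. (8.13) can be generated by enumerating Y's acceptance thresholds. This sketch makes several assumptions for illustration: `y_bar` is a hypothetical name for the committed threshold, X breaks ties toward the lowest offer, and only thresholds 2 to M - 1 are tabulated, as in the table above.

```python
def threshold_outcome(M, y_bar):
    """Y accepts iff x >= y_bar; X picks the payoff-maximizing offer (lowest on ties)."""
    accept = lambda x: 1 if x >= y_bar else 0
    x_star = max(range(1, M + 1), key=lambda x: ((M - x) * accept(x), -x))
    y_star = accept(x_star)
    return x_star, y_star, ((M - x_star) * y_star, x_star * y_star)

def compare_thresholds(M):
    """Outcome for each threshold 2..M-1, plus the threshold that is best for Y."""
    table = {y_bar: threshold_outcome(M, y_bar) for y_bar in range(2, M)}
    best = max(table, key=lambda y_bar: table[y_bar][2][1])
    return table, best

# Example with a hypothetical stake of M = 10 units.
table, best_threshold = compare_thresholds(10)
```

For M = 10 the best threshold for Y is 9, leaving X with a payoff of 1, which matches the last row of the table.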



8.2.4 An indicative solution reflecting symmetries 

Obviously, in normal play of the ultimatum game, X does not normally expect that
they need to offer all of the available funds to avoid rejection, and Y seldom elects to
reject every offer less than all of the funds. This might arise because the game is now
highly symmetric.

A conventional analysis shows that player X can garner a payoff of M - 1 and force
Y to accept a payoff of 1. The isomorphically constrained analysis here shows that Y can
force a payoff of M - 1 for themselves leaving X with a minimal payoff of 1. Player X,
facing a minimal payoff of 1, could then seek to modify their own probability space and
undertake to not even consider offers greater than some threshold $\bar{x}$ say. It is
possible that an extended analysis taking account of the ability of both X and Y to veto
offers will settle on a choice around x = y = M/2 or thereabouts.

The analysis presented here is indicative only and we do not attempt to resolve the
ultimatum game. It suffices for our purposes to show that including isomorphic constraints
within the strategy spaces of the ultimatum game allows a broader range of equilibrium
outcomes than considered by conventional game theory.

8.3 Discussion 

This paper presents an analysis of isomorphically constrained play in the finitely iterated 
Ultimatum game. The use of isomorphic constraints reduces the dimensionality of the 
game strategy spaces and can modify game properties and equilibrium points. We suggest 
that these constraints are routinely exploited in human play to maximize player outcomes. 
We crudely suggested that fair play might be one possible outcome of our extended 
analysis. 

Experiments across a wide range of cultures show human players commonly adopting 
fair play. This carries the implication that human game players in a diversity of 
cultures have a natural ability to exploit isomorphic constraints to their own ends. 
Further, we suggest that the use of isomorphic constraints is common in bargaining situations 
and in economics in general, and that game theory must be able to properly 
model these isomorphic constraints in strategic interactions. Our analytical approach 
is also likely to be applicable to the wider economic sphere as modeled 
by game theory. 
Chapter 9 

The public goods game 



9.1 Introduction 

There are many situations in which a number of players must jointly participate in 
creating some common resource but where no player can be prevented from exploiting 
that resource. This creates a "free-rider" or "tragedy of the commons" style problem because, 
while all players benefit if the public good is provided, any individual player can increase 
their benefits if they avoid paying their share of the costs [75]. As a result, players do 
not cooperate and the public good is not provided. These results are altered if players 
are able to punish free riders, even when punishment carries significant costs to the 
initiator [76]. The public goods game allows experimental examination of how norms 
of cooperative behaviour are established and enforced using a wide range of theoretical 
approaches [77, 78, 79, 80], including a proposed quantum solution [81]. 

9.2 A simplified public goods game 

Here as usual, we simplify the public goods game as far as possible without losing any of 
its character. In particular, we restrict the number of players to two, designated as usual 
X and Y, and also restrict both the amounts that can be exchanged and the amounts 
used to punish opponents. 

The minimal public goods game, as pictured in Fig. 9.1, is defined over two sequential 
stages. In stage one, players X and Y each choose whether four units of payoff are 
retained ($x_1, y_1 = 0$) or invested ($x_1, y_1 = 1$). The return to each player from their own 
investment is negative whilst the return to them from their opponent's investment is 
positive. The payoffs to the players from their joint actions in stage one are 

$$
\begin{aligned}
\Pi^X &= 4 - x_1 + 3y_1 \\
\Pi^Y &= 4 + 3x_1 - y_1.
\end{aligned} \tag{9.1}
$$
Figure 9.1: A minimal public goods game involving two players X and Y who simultaneously 
choose to make an investment of some amount $x_1, y_1 \in \{0,1\}$ in stage one. 
The return to each player of their own investment is negative whilst the return to them 
from their opponent's investment is positive. Thus, investment is a public good which 
creates a free rider problem. In the second stage, each player can choose either to punish 
their opponent for their first stage actions ($x_2, y_2 = 1$) at some cost to themselves, or not 
($x_2, y_2 = 0$), with the corresponding payoffs shown. 



Thus, should both X and Y make no investments via $x_1 = y_1 = 0$ then their payoffs are 
$\Pi^X = \Pi^Y = 4$, while if both invest all their funds via $x_1 = y_1 = 1$ then their payoffs are 
improved to $\Pi^X = \Pi^Y = 6$. Unfortunately however, it pays for each player to free ride 
on their opponent's investment: should X invest their funds ($x_1 = 1$) while Y retains all 
of their funds ($y_1 = 0$), the joint payoffs are $(\Pi^X, \Pi^Y) = (3,7)$, making it tempting for 
Y to free ride. Conversely, should X retain their funds while Y invests, the payoffs are 
$(\Pi^X, \Pi^Y) = (7,3)$. The net result is that game theory predicts that both players attempt 
to free ride on the investment of their opponent, resulting in payoffs that are not Pareto optimal. 

The willingness of players to incur costs to punish their free riding opponents can 
then be studied by adding a second stage as shown in Fig. 9.1. Here, each player can 
choose either to not punish their opponent ($x_2, y_2 = 0$), leaving all payoffs unchanged, or 
to punish their opponent ($x_2, y_2 = 1$) at some cost to themselves. That is, 
should a player choose to punish their opponent, they decrease their own payoff by four units 
while at the same time decreasing their opponent's payoff by eight units. Consequently, 
by the end of stage two, the joint payoffs are 

$$
\begin{aligned}
\Pi^X &= 4 - x_1 + 3y_1 - 4x_2 - 8y_2 \\
\Pi^Y &= 4 + 3x_1 - y_1 - 8x_2 - 4y_2.
\end{aligned} \tag{9.2}
$$

It is this two stage form of the game that generates significant discrepancies between 
game theoretic predictions and observed human play. In particular, because punishment 
is costly, game theory makes the firm prediction that rational players will never 
choose to punish their opponents. However, precisely the opposite tends to occur in 
practice. People exhibit a strong tendency to punish their free riding opponents even 
when this reduces their own payoffs. Herein lies the interest in the public goods game. 
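
The two stage payoffs described above can be tabulated directly. The following Python sketch is an illustration only: the function name `payoffs` and its argument order are our own labels, not notation from the text. It encodes the stage one returns together with the punishment terms (a punisher loses four units, the punished player loses eight) and shows that punishing lowers the punisher's own payoff on every history.

```python
from itertools import product


def payoffs(x1, y1, x2=0, y2=0):
    """Return (payoff_X, payoff_Y) for investments x1, y1 and punishments x2, y2."""
    pay_x = 4 - x1 + 3 * y1 - 4 * x2 - 8 * y2
    pay_y = 4 + 3 * x1 - y1 - 8 * x2 - 4 * y2
    return pay_x, pay_y


def punishment_cost_to_x():
    """Set of payoff losses X suffers by switching x2 from 0 to 1, over all histories."""
    return {
        payoffs(x1, y1, 0, y2)[0] - payoffs(x1, y1, 1, y2)[0]
        for x1, y1, y2 in product((0, 1), repeat=3)
    }


# Stage one outcomes when nobody punishes.
for x1, y1 in product((0, 1), repeat=2):
    print((x1, y1), payoffs(x1, y1))

# Punishing always costs the punisher exactly four units.
print(punishment_cost_to_x())
```

Because the cost of punishing is the same on every history, a payoff maximizing player with an independent second stage choice has no history on which punishment pays, which is the conventional prediction discussed in the text.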

9.2.1 Unconstrained behavioural strategy spaces: $\mathcal{P}^X_B \times \mathcal{P}^Y_B$ 

Conventional game analysis commences with the assumption that both players X and Y 
together adopt a joint probability space $\mathcal{P}^X_B \times \mathcal{P}^Y_B$ in which every behavioural strategy 
on every history set is independent. One possibility for the joint behavioural strategy 
space is shown in Fig. 9.1. We have chosen a terminology allowing the expected payoff 
function for player $Z \in \{X, Y\}$ to be written as 

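$$
\begin{aligned}
Z: \max \langle \Pi^Z \rangle &= \sum_{x_1,y_1,x_2,y_2=0}^{1} P^{XY}(x_1,y_1,x_2,y_2)\, \Pi^Z(x_1,y_1,x_2,y_2) \\
&= \sum_{x_1,y_1,x_2,y_2=0}^{1} P^X(x_1) P^Y(y_1) P^X(x_2|x_1y_1) P^Y(y_2|x_1y_1)\, \Pi^Z(x_1,y_1,x_2,y_2) \\
&= \sum_{x_1,y_1,x_2,y_2=0}^{1} p_{x_1} q_{y_1} p_{x_2|x_1y_1} q_{y_2|x_1y_1}\, \Pi^Z(x_1,y_1,x_2,y_2).
\end{aligned} \tag{9.3}
$$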

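We also have implicit normalization conditions such as $p_0 + p_1 = 1$ and $p_{0|x_1y_1} + p_{1|x_1y_1} = 1$, 
and so on. The expected payoff functions for each player are then 

$$
\begin{aligned}
X: \max_{p_1,\, p_{1|x_1y_1}} \langle \Pi^X \rangle &= 4 - \langle x_1 \rangle + 3\langle y_1 \rangle - 4\langle x_2 \rangle - 8\langle y_2 \rangle \\
&= 4 - p_1 + 3q_1 - 4 \sum_{x_1y_1x_2=0}^{1} p_{x_1} q_{y_1} p_{x_2|x_1y_1} x_2 - 8 \sum_{x_1y_1y_2=0}^{1} p_{x_1} q_{y_1} q_{y_2|x_1y_1} y_2 \\
&= 4 - p_1 + 3q_1 - 4 \sum_{x_1y_1=0}^{1} p_{x_1} q_{y_1} p_{1|x_1y_1} - 8 \sum_{x_1y_1=0}^{1} p_{x_1} q_{y_1} q_{1|x_1y_1} \\
Y: \max_{q_1,\, q_{1|x_1y_1}} \langle \Pi^Y \rangle &= 4 + 3\langle x_1 \rangle - \langle y_1 \rangle - 8\langle x_2 \rangle - 4\langle y_2 \rangle \\
&= 4 + 3p_1 - q_1 - 8 \sum_{x_1y_1x_2=0}^{1} p_{x_1} q_{y_1} p_{x_2|x_1y_1} x_2 - 4 \sum_{x_1y_1y_2=0}^{1} p_{x_1} q_{y_1} q_{y_2|x_1y_1} y_2 \\
&= 4 + 3p_1 - q_1 - 8 \sum_{x_1y_1=0}^{1} p_{x_1} q_{y_1} p_{1|x_1y_1} - 4 \sum_{x_1y_1=0}^{1} p_{x_1} q_{y_1} q_{1|x_1y_1}.
\end{aligned} \tag{9.4}
$$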
Here, the expected payoff functions are continuous multivariate functions dependent on 
the probability parameters $p_1, p_{1|00}, p_{1|01}, p_{1|10}, p_{1|11}$ and $q_1, q_{1|00}, q_{1|01}, q_{1|10}, q_{1|11}$, so the 
relevant gradient operator is 

$$
\nabla = \left[ \frac{\partial}{\partial p_1}, \frac{\partial}{\partial p_{1|00}}, \frac{\partial}{\partial p_{1|01}}, \frac{\partial}{\partial p_{1|10}}, \frac{\partial}{\partial p_{1|11}}, \frac{\partial}{\partial q_1}, \frac{\partial}{\partial q_{1|00}}, \frac{\partial}{\partial q_{1|01}}, \frac{\partial}{\partial q_{1|10}}, \frac{\partial}{\partial q_{1|11}} \right]. \tag{9.5}
$$

Normalization conditions mean that any term dependent on $p_0$ or $p_{0|x_1y_1}$ contributes a 
negative term to any gradient with respect to $p_1$ or $p_{1|x_1y_1}$ respectively. Similar 
considerations apply to the q parameters. 

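Taking account of normalization, the optimization conditions evaluated by each player 
are 

$$
\begin{aligned}
\frac{\partial \langle \Pi^X \rangle}{\partial p_1} &= -1 + 4 \sum_{y_1=0}^{1} q_{y_1} \left( p_{1|0y_1} - p_{1|1y_1} \right) + 8 \sum_{y_1=0}^{1} q_{y_1} \left( q_{1|0y_1} - q_{1|1y_1} \right) \\
\frac{\partial \langle \Pi^X \rangle}{\partial p_{1|00}} &= -4 p_0 q_0 \\
\frac{\partial \langle \Pi^X \rangle}{\partial p_{1|01}} &= -4 p_0 q_1 \\
\frac{\partial \langle \Pi^X \rangle}{\partial p_{1|10}} &= -4 p_1 q_0 \\
\frac{\partial \langle \Pi^X \rangle}{\partial p_{1|11}} &= -4 p_1 q_1 \\
\frac{\partial \langle \Pi^Y \rangle}{\partial q_1} &= -1 + 8 \sum_{x_1=0}^{1} p_{x_1} \left( p_{1|x_10} - p_{1|x_11} \right) + 4 \sum_{x_1=0}^{1} p_{x_1} \left( q_{1|x_10} - q_{1|x_11} \right) \\
\frac{\partial \langle \Pi^Y \rangle}{\partial q_{1|00}} &= -4 p_0 q_0 \\
\frac{\partial \langle \Pi^Y \rangle}{\partial q_{1|01}} &= -4 p_0 q_1 \\
\frac{\partial \langle \Pi^Y \rangle}{\partial q_{1|10}} &= -4 p_1 q_0 \\
\frac{\partial \langle \Pi^Y \rangle}{\partial q_{1|11}} &= -4 p_1 q_1.
\end{aligned} \tag{9.6}
$$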



Thus, player X finds that the rate of change of their payoff with respect to $p_{1|ij}$ is never 
positive, so they set $p_{1|ij} = 0$ for all i and j. Similarly, player Y sets $q_{1|ij} = 0$ as the 
rate of change of their payoff with respect to $q_{1|ij}$ is also never positive for all i and 
j. That is, there are no histories in which it is payoff maximizing for either player to 
punish their opponent. In turn, these results simplify the remaining two conditions for 
first stage moves, giving 

$$
\begin{aligned}
\frac{\partial \langle \Pi^X \rangle}{\partial p_1} &= -1 \\
\frac{\partial \langle \Pi^Y \rangle}{\partial q_1} &= -1.
\end{aligned} \tag{9.7}
$$
This establishes that both players maximize their expected payoffs by setting $p_1 = 0$ 
and $q_1 = 0$ in the first stage. Thus, both players make no investment in the first round, 
confident in the knowledge that their opponent will not punish them for this. The 
Nash equilibrium for this simplified public goods game is then $(x_1,y_1,x_2,y_2) = (0,0,0,0)$, 
generating expected payoffs of $(\langle \Pi^X \rangle, \langle \Pi^Y \rangle) = (4,4)$. As noted previously, these payoffs 
are not Pareto optimal as they could be improved by both players adopting different 
choices, as is commonly observed in human play. 
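
The conventional result can be checked by brute force over pure strategies. The sketch below is a simplified stand-in for the gradient argument above, not the author's procedure: it assumes pure choices only, first confirms that each player has a dominant second stage choice on every history, and then searches the reduced first stage game for pure Nash equilibria. All function names are our own.

```python
from itertools import product


def payoffs(x1, y1, x2, y2):
    """Two stage payoffs (payoff_X, payoff_Y) of the simplified public goods game."""
    pay_x = 4 - x1 + 3 * y1 - 4 * x2 - 8 * y2
    pay_y = 4 + 3 * x1 - y1 - 8 * x2 - 4 * y2
    return pay_x, pay_y


def stage_two_choice(x1, y1):
    """Dominant second stage choices (x2, y2) on the history (x1, y1)."""
    best_x2 = {max((0, 1), key=lambda x2: payoffs(x1, y1, x2, y2)[0]) for y2 in (0, 1)}
    best_y2 = {max((0, 1), key=lambda y2: payoffs(x1, y1, x2, y2)[1]) for x2 in (0, 1)}
    # A dominant choice exists only if the best reply never depends on the opponent.
    assert len(best_x2) == 1 and len(best_y2) == 1
    return best_x2.pop(), best_y2.pop()


def reduced(x1, y1):
    """First stage payoffs once the second stage is played optimally."""
    x2, y2 = stage_two_choice(x1, y1)
    return payoffs(x1, y1, x2, y2)


def pure_nash_stage_one():
    """Pure Nash equilibria (x1, y1) of the reduced first stage game."""
    found = []
    for x1, y1 in product((0, 1), repeat=2):
        pay_x, pay_y = reduced(x1, y1)
        if pay_x >= reduced(1 - x1, y1)[0] and pay_y >= reduced(x1, 1 - y1)[1]:
            found.append((x1, y1))
    return found


print(pure_nash_stage_one(), reduced(0, 0))
```

Under these assumptions the only equilibrium found is mutual free riding with no punishment, in line with the outcome (0,0,0,0) and payoffs (4,4) stated above.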

Rational players are able to alter their choice of probability space, and will optimize 
this choice so as to maximize their expected payoffs. We here suppose that players might 
each consider a total of two alternate probability spaces. 







Figure 9.2: The case where players (X,Y) adopt the $\mathcal{P}^X_B|_{x_2=1-y_1} \times \mathcal{P}^Y_B|_{y_2=1-x_1}$ joint 
probability space where both players functionally anti-correlate their second stage choices to 
their opponent's first stage choices. Then a failure to invest automatically invokes 
punishment while investment invokes no punishment. 



9.2.2 Isomorphically anti-correlated space $\mathcal{P}^X_B|_{x_2=1-y_1} \times \mathcal{P}^Y_B|_{y_2=1-x_1}$ 

Suppose first that both players X and Y choose to adopt a joint probability space 
$\mathcal{P}^X_B|_{x_2=1-y_1} \times \mathcal{P}^Y_B|_{y_2=1-x_1}$, as shown in Fig. 9.2, in which they each functionally 
anti-correlate their second stage choices to the previous choices of their opponents. This is 
implemented via 

$$
\begin{aligned}
x_2 &= 1 - y_1, & p_{1|x_1y_1} &= \delta_{1,(1-y_1)} \\
y_2 &= 1 - x_1, & q_{1|x_1y_1} &= \delta_{1,(1-x_1)}.
\end{aligned} \tag{9.8}
$$
This choice of probability space alters the dimensions of the game space, the game 
trees, and the payoff functions to be 

$$
\begin{aligned}
\Pi^X &= 4 - x_1 + 3y_1 - 4x_2 - 8y_2 \\
&= 4 - x_1 + 3y_1 - 4(1 - y_1) - 8(1 - x_1) \\
&= -8 + 7x_1 + 7y_1 \\
\Pi^Y &= 4 + 3x_1 - y_1 - 8x_2 - 4y_2 \\
&= 4 + 3x_1 - y_1 - 8(1 - y_1) - 4(1 - x_1) \\
&= -8 + 7x_1 + 7y_1.
\end{aligned} \tag{9.9}
$$

It is then immediately evident that players maximize their own payoffs by choosing to 
invest, $(x_1, y_1) = (1, 1)$, which invokes a subsequent lack of punishment in stage two, giving 
$(x_2, y_2) = (0,0)$. The final payoffs are then $(\Pi^X, \Pi^Y) = (6,6)$. 
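
The reduction of the payoffs under the anti-correlation constraints can be verified mechanically. The following sketch is illustrative only (pure strategies, function names of our own choosing): it substitutes x2 = 1 - y1 and y2 = 1 - x1 into the two stage payoffs, compares the result with -8 + 7 x1 + 7 y1, and lists the pure Nash equilibria of the constrained game.

```python
from itertools import product


def payoffs(x1, y1, x2, y2):
    """Two stage payoffs (payoff_X, payoff_Y) of the simplified public goods game."""
    pay_x = 4 - x1 + 3 * y1 - 4 * x2 - 8 * y2
    pay_y = 4 + 3 * x1 - y1 - 8 * x2 - 4 * y2
    return pay_x, pay_y


def constrained_payoffs(x1, y1):
    """Payoffs when both punishment choices are anti-correlated to the opponent's investment."""
    x2 = 1 - y1  # X punishes exactly when Y failed to invest
    y2 = 1 - x1  # Y punishes exactly when X failed to invest
    return payoffs(x1, y1, x2, y2)


def pure_nash():
    """Pure Nash equilibria (x1, y1) of the constrained game."""
    found = []
    for x1, y1 in product((0, 1), repeat=2):
        pay_x, pay_y = constrained_payoffs(x1, y1)
        if pay_x >= constrained_payoffs(1 - x1, y1)[0] and pay_y >= constrained_payoffs(x1, 1 - y1)[1]:
            found.append((x1, y1))
    return found


for x1, y1 in product((0, 1), repeat=2):
    print((x1, y1), constrained_payoffs(x1, y1), -8 + 7 * x1 + 7 * y1)
print(pure_nash())
```

In this constrained game each unit of investment raises both payoffs by seven, so mutual investment is the only pure equilibrium the search returns.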

Optimization of the expected payoffs must reproduce this result. The isomorphically 
constrained expected payoff functions can simply be read from the tree in Fig. 9.2 and 
are 

$$
\begin{aligned}
X: \max_{p_1} \langle \Pi^X \rangle &= -8 + 7p_1 + 7q_1 \\
Y: \max_{q_1} \langle \Pi^Y \rangle &= -8 + 7p_1 + 7q_1.
\end{aligned} \tag{9.10}
$$
These expected payoffs are continuous multivariate functions dependent only on the freely 
varying parameters $p_1$ and $q_1$, so the relevant gradient operator used by both players is 

$$
\nabla = \left[ \frac{\partial}{\partial p_1}, \frac{\partial}{\partial q_1} \right]. \tag{9.11}
$$

Immediately then, the optimization conditions evaluated by each player are 

$$
\begin{aligned}
\frac{\partial \langle \Pi^X \rangle}{\partial p_1} &= 7 \\
\frac{\partial \langle \Pi^Y \rangle}{\partial q_1} &= 7,
\end{aligned} \tag{9.12}
$$

ensuring that both player X and player Y maximize their expected payoffs by investing 
their funds, setting $p_1 = 1$ giving $x_1 = 1$ and $q_1 = 1$ giving $y_1 = 1$. The functionally 
assigned punishment choices then ensure that neither player punishes the other, so the 
equilibrium choice of play is $(x_1,y_1,x_2,y_2) = (1,1,0,0)$, generating expected payoffs of 
$(\langle \Pi^X \rangle, \langle \Pi^Y \rangle) = (6,6)$. 

9.2.3 Anti-correlated and independent space: $\mathcal{P}^X_B|_{x_2=1-y_1} \times \mathcal{P}^Y_B$ 

To complete this simplified analysis of the reduced public goods game considered here, 
both players might also examine the possible joint probability space $\mathcal{P}^X_B|_{x_2=1-y_1} \times \mathcal{P}^Y_B$ in 
which X anti-correlates their second stage choice to their opponent's first stage choice 
while Y does not employ any isomorphic constraints (see Fig. 9.3). (Symmetry allows 
these results to be used for the space $\mathcal{P}^X_B \times \mathcal{P}^Y_B|_{y_2=1-x_1}$ after an appropriate reflection.) 
The required functional anti-correlations are implemented via 

$$
\begin{aligned}
x_2 &= 1 - y_1 \\
p_{1|x_1y_1} &= \delta_{1,(1-y_1)}.
\end{aligned} \tag{9.13}
$$

Figure 9.3: The case where players (X,Y) adopt the $\mathcal{P}^X_B|_{x_2=1-y_1} \times \mathcal{P}^Y_B$ joint probability 
space where X functionally anti-correlates their second stage choices to their opponent's 
first stage choice and so automatically punishes a failure to invest, while Y adopts 
everywhere independent behavioural strategies in both their stages. 



In the adopted probability space, the payoff functions for the players are then 

$$
\begin{aligned}
\Pi^X &= 4 - x_1 + 3y_1 - 4x_2 - 8y_2 \\
&= 4 - x_1 + 3y_1 - 4(1 - y_1) - 8y_2 \\
&= -x_1 + 7y_1 - 8y_2 \\
\Pi^Y &= 4 + 3x_1 - y_1 - 8x_2 - 4y_2 \\
&= 4 + 3x_1 - y_1 - 8(1 - y_1) - 4y_2 \\
&= -4 + 3x_1 + 7y_1 - 4y_2.
\end{aligned} \tag{9.14}
$$

Here, player X sets $x_1 = 0$ to maximize their payoff while Y sets $y_1 = 1$ and $y_2 = 0$ to 
maximize their payoff. The final outcome is $(\Pi^X, \Pi^Y) = (7,3)$. 
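
The asymmetric case can be checked in the same spirit. The sketch below is an illustration under simplifying assumptions that are ours rather than the text's: play is restricted to pure strategies and Y's punishment choice y2 is treated as a single unconditional choice instead of one choice per history. It imposes x2 = 1 - y1 on X only and searches for profiles from which neither player gains by deviating.

```python
from itertools import product


def payoffs(x1, y1, x2, y2):
    """Two stage payoffs (payoff_X, payoff_Y) of the simplified public goods game."""
    pay_x = 4 - x1 + 3 * y1 - 4 * x2 - 8 * y2
    pay_y = 4 + 3 * x1 - y1 - 8 * x2 - 4 * y2
    return pay_x, pay_y


def constrained(x1, y1, y2):
    """X's punishment is tied to Y's investment via x2 = 1 - y1; Y's y2 stays free."""
    return payoffs(x1, y1, 1 - y1, y2)


def equilibria():
    """Profiles (x1, y1, y2) where neither player can gain by a unilateral change."""
    found = []
    for x1, y1, y2 in product((0, 1), repeat=3):
        pay_x, pay_y = constrained(x1, y1, y2)
        x_ok = pay_x >= constrained(1 - x1, y1, y2)[0]
        y_ok = all(pay_y >= constrained(x1, a, b)[1] for a, b in product((0, 1), repeat=2))
        if x_ok and y_ok:
            found.append((x1, y1, y2))
    return found


print(equilibria(), constrained(0, 1, 0))
```

The search singles out the profile in which X retains their funds while Y invests and does not punish, so X free rides on a Y who has kept their options open.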

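A similar result is obtained from optimizing the expected payoff functions. The 
isomorphically constrained joint probability space $\mathcal{P}^X_B|_{x_2=1-y_1} \times \mathcal{P}^Y_B$ specifies the expected 
payoff optimization problem after the resolution of the imposed functional constraints as 

$$
\begin{aligned}
X: \max_{p_1} \langle \Pi^X \rangle &= 4 - p_1 + 3q_1 - 4 \sum_{x_1y_1=0}^{1} p_{x_1} q_{y_1} p_{1|x_1y_1} - 8 \sum_{x_1y_1=0}^{1} p_{x_1} q_{y_1} q_{1|x_1y_1} \\
&= 4 - p_1 + 3q_1 - 4 \sum_{x_1y_1=0}^{1} p_{x_1} q_{y_1} \delta_{1,(1-y_1)} - 8 \sum_{x_1y_1=0}^{1} p_{x_1} q_{y_1} q_{1|x_1y_1} \\
&= 4 - p_1 + 3q_1 - 4 \sum_{x_1=0}^{1} p_{x_1} q_0 - 8 \sum_{x_1y_1=0}^{1} p_{x_1} q_{y_1} q_{1|x_1y_1} \\
&= 4 - p_1 + 3q_1 - 4q_0 - 8 \sum_{x_1y_1=0}^{1} p_{x_1} q_{y_1} q_{1|x_1y_1} \\
Y: \max_{q_1,\, q_{1|x_1y_1}} \langle \Pi^Y \rangle &= 4 + 3p_1 - q_1 - 8 \sum_{x_1y_1=0}^{1} p_{x_1} q_{y_1} p_{1|x_1y_1} - 4 \sum_{x_1y_1=0}^{1} p_{x_1} q_{y_1} q_{1|x_1y_1} \\
&= 4 + 3p_1 - q_1 - 8 \sum_{x_1y_1=0}^{1} p_{x_1} q_{y_1} \delta_{1,(1-y_1)} - 4 \sum_{x_1y_1=0}^{1} p_{x_1} q_{y_1} q_{1|x_1y_1} \\
&= 4 + 3p_1 - q_1 - 8 \sum_{x_1=0}^{1} p_{x_1} q_0 - 4 \sum_{x_1y_1=0}^{1} p_{x_1} q_{y_1} q_{1|x_1y_1} \\
&= 4 + 3p_1 - q_1 - 8q_0 - 4 \sum_{x_1y_1=0}^{1} p_{x_1} q_{y_1} q_{1|x_1y_1}.
\end{aligned} \tag{9.15}
$$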



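These expected payoffs are continuous multivariate functions dependent only on the first 
stage freely varying parameters $p_1$ and $q_1$ and the second stage independent parameters 
$q_{1|00}, q_{1|01}, q_{1|10}, q_{1|11}$, so the relevant gradient operator used by both players to analyze 
this particular probability space is 

$$
\nabla = \left[ \frac{\partial}{\partial p_1}, \frac{\partial}{\partial q_1}, \frac{\partial}{\partial q_{1|00}}, \frac{\partial}{\partial q_{1|01}}, \frac{\partial}{\partial q_{1|10}}, \frac{\partial}{\partial q_{1|11}} \right]. \tag{9.16}
$$

The resulting optimization conditions evaluated by each player are 

$$
\begin{aligned}
\frac{\partial \langle \Pi^X \rangle}{\partial p_1} &= -1 + 8 \sum_{y_1=0}^{1} q_{y_1} \left( q_{1|0y_1} - q_{1|1y_1} \right) \\
\frac{\partial \langle \Pi^Y \rangle}{\partial q_1} &= 7 + 4 \sum_{x_1=0}^{1} p_{x_1} \left( q_{1|x_10} - q_{1|x_11} \right) \\
\frac{\partial \langle \Pi^Y \rangle}{\partial q_{1|00}} &= -4 p_0 q_0 \\
\frac{\partial \langle \Pi^Y \rangle}{\partial q_{1|01}} &= -4 p_0 q_1 \\
\frac{\partial \langle \Pi^Y \rangle}{\partial q_{1|10}} &= -4 p_1 q_0 \\
\frac{\partial \langle \Pi^Y \rangle}{\partial q_{1|11}} &= -4 p_1 q_1.
\end{aligned} \tag{9.17}
$$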
The last four conditions here ensure that Y maximizes their expected payoff by setting 
$q_{1|x_1y_1} = 0$ on any history $x_1y_1$. That is, Y chooses the second stage choice $y_2 = 0$ and 
never punishes X irrespective of X's first stage move. In turn, substituting these results 
into the second condition establishes that Y maximizes their expected payoff by setting 
$q_1 = 1$ giving $y_1 = 1$. That is, Y always invests their funds in stage one. Consequently, 
these results substituted into the first condition show that X maximizes their payoff by 
setting $p_1 = 0$ giving $x_1 = 0$, and so free rides on their opponent's inability to punish 
them. The resulting equilibrium choice of play is $(x_1,y_1,x_2,y_2) = (0,1,0,0)$, generating 
expected payoffs of $(\langle \Pi^X \rangle, \langle \Pi^Y \rangle) = (7,3)$. 

9.2.4 Expected payoff comparison 

Altogether, the various joint probability spaces considered here which might be adopted 
by the players give a table of expected payoff outcomes of 

$$
\begin{array}{l|cc}
(\langle \Pi^X \rangle, \langle \Pi^Y \rangle) & \mathcal{P}^Y_B & \mathcal{P}^Y_B|_{y_2=1-x_1} \\ \hline
\mathcal{P}^X_B & (4,4) & (3,7) \\
\mathcal{P}^X_B|_{x_2=1-y_1} & (7,3) & (6,6)
\end{array} \tag{9.18}
$$

making it evident that, to maximize their payoff, both players must rationally elect to 
use the joint probability space $\mathcal{P}^X_B|_{x_2=1-y_1} \times \mathcal{P}^Y_B|_{y_2=1-x_1}$ in preference to any of the alternate 
probability spaces considered here. That is, players X and Y will undertake to functionally 
anti-correlate their second stage decision to the previous choice of their opponent, 
and thereby deny themselves a second stage choice during the game. Again, they do 
this knowing it to be the payoff maximizing choice of probability space (among the few 
examined here). 

The clear prediction of our analysis is that players of unbounded rationality will 
choose not to free ride on their neighbours and will punish free riders even at considerable 
cost to themselves. 
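
The comparison table can itself be read as a two by two game over the choice of probability space. The sketch below is our own framing for illustration: the labels "independent" and "anti-correlated" are shorthand we introduce for the unconstrained and constrained spaces, and the payoff entries are the four outcomes derived in this chapter. It checks which choice of space is dominant for each player and lists the pure Nash equilibria of the space-selection game.

```python
SPACES = ("independent", "anti-correlated")

# Keys are (X's space, Y's space); values are (expected payoff X, expected payoff Y).
TABLE = {
    ("independent", "independent"): (4, 4),
    ("independent", "anti-correlated"): (3, 7),
    ("anti-correlated", "independent"): (7, 3),
    ("anti-correlated", "anti-correlated"): (6, 6),
}


def dominant_space(player):
    """Return the space that is a best reply to every opposing space, or None."""
    for mine in SPACES:
        others = [s for s in SPACES if s != mine]
        if player == "X":
            ok = all(TABLE[(mine, t)][0] >= TABLE[(o, t)][0] for t in SPACES for o in others)
        else:
            ok = all(TABLE[(t, mine)][1] >= TABLE[(t, o)][1] for t in SPACES for o in others)
        if ok:
            return mine
    return None


def pure_nash():
    """Pure Nash equilibria of the space-selection game."""
    found = []
    for sx in SPACES:
        for sy in SPACES:
            pay_x, pay_y = TABLE[(sx, sy)]
            x_ok = all(pay_x >= TABLE[(o, sy)][0] for o in SPACES)
            y_ok = all(pay_y >= TABLE[(sx, o)][1] for o in SPACES)
            if x_ok and y_ok:
                found.append((sx, sy))
    return found


print(dominant_space("X"), dominant_space("Y"), pure_nash())
```

In this framing the space-selection game is not a prisoner's dilemma: committing to automatic punishment is each player's dominant choice and also yields the jointly preferred outcome (6,6).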
Chapter 10 

The centipede game 



10.1 Introduction 



The centipede game was introduced by Rosenthal [2S]- A readily accessible treatment can 
be found in [82] . The centipede game is of interest due to the extreme discrepancy between 
experimentally observed play and the predictions of game theory — see the experimental 
investigations in [83] with discrepancies explained by allowing players to altruistically 
consider their opponent's payoffs, or by using learning approaches to explain observed 
discrepancies in a normal form centipede game [51]. More generally, the centipede game 
has had a prime role in arguments over the definitions of rationality, common knowledge 
of rationality, and backwards induction [85, 86, 87, 88, 89, 90, 91]. In part, this ongoing 
debate has led to the wider impugning of backwards induction [53 [511, 1221 ES], but see 
the defence of backwards induction in [94]. For an indication of the role of this game in 
the w'-^"" »«™^™^«. .^J .^«;.i „„;„„„„„ „„„ [nc] 
Figure 10.1: A truncated centipede game decision tree over 6 stages where two players X 
and Y alternately choose to either play down ($x_i, y_i = 0$ for $1 \le i \le 3$), in which case the 
game stops, or play across ($x_i, y_i = 1$ for $1 \le i \le 3$), so that either their opponent faces 
a similar choice or the game terminates in stage 6. 
10.2 The centipede game 

The centipede game gains its peculiar name as it normally features two players playing 
over 100 turns so that, when drawn horizontally as in Fig. 10.1, the game tree takes the 
appearance of a centipede. Here, we truncate the game without loss of generality at only 
6 stages, allowing a tractable analysis. In this truncated centipede game, each player X 
or Y must alternately elect to either play down ($x_i, y_i = 0$ for $1 \le i \le 3$), in which case 
the game immediately terminates and players gain the respective payoffs shown, or play 
across ($x_i, y_i = 1$ for $1 \le i \le 3$), in which case either their opponent plays or the game 
terminates with the payoffs shown. When either player hands play to their opponent, 
they suffer a short term loss of potential payoff with the prospect of a long term gain. 
The interest in this game comes from the countervailing effects of these short term losses 
and long term gains, which combine to ensure that human players typically fail 
to follow the recommendations of game theory and yet significantly improve their payoffs 
by doing so. 

In fact, the centipede game has a unique subgame perfect equilibrium solution, which 
can be readily located by simply inspecting Fig. 10.1 and applying backwards induction. 
In the last (far right) stage, Y can choose $y_3 = 0$ to obtain a payoff of $\Pi^Y = 6$, or can 
choose $y_3 = 1$ to obtain a payoff of $\Pi^Y = 5$. Obviously, Y will prefer to play down with 
$y_3 = 0$ in this final stage to maximize their payoff. Player X is well able to deduce this 
and to conclude that if they choose $x_3 = 1$ to play across in the second last stage then they 
will obtain a payoff of $\Pi^X = 4$ when Y subsequently plays down. In contrast, should X 
play down themselves by choosing $x_3 = 0$, they will gain the improved payoff of $\Pi^X = 5$. 
Obviously, X will choose $x_3 = 0$ to preempt Y's choice of $y_3 = 0$. Exactly the same 
argument applies to Y's choice in the fourth stage, to X's choice in the third stage, to 
Y's choice in the second stage, and finally to X's choice in the first stage. That is, being 
able to deduce that Y will play down in the second stage by choosing $y_1 = 0$ to give X a 
payoff of $\Pi^X = 0$, player X will choose to maximize their payoff by preempting Y 
and playing down in the first stage through the choice $x_1 = 0$ to gain an improved payoff 
of $\Pi^X = 1$. The associated payoff for Y is $\Pi^Y = 0$. 
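
The backwards induction argument can be replayed in a few lines of code. The sketch below is an illustration, with names of our own choosing. The terminal payoffs are those of the truncated six stage game of this chapter: the values (1,0), (0,2), (5,3), (4,6) and (6,5) are quoted in the argument above, while (3,1) and (2,4) are taken from the payoff functions given later in this section.

```python
# Payoff pairs (X, Y) when play stops at decision node k = 0..5
# (move order x1, y1, x2, y2, x3, y3), and the payoff if everyone plays across.
DOWN = [(1, 0), (0, 2), (3, 1), (2, 4), (5, 3), (4, 6)]
END = (6, 5)


def backward_induction(down=DOWN, end=END):
    """Return (moves, payoffs): the planned move at each node (0 = down, 1 = across) and the outcome."""
    value = end
    moves = []
    for k in reversed(range(len(down))):
        mover = k % 2  # 0 means X moves at this node, 1 means Y moves
        if down[k][mover] >= value[mover]:
            value = down[k]  # stopping here is at least as good for the mover
            moves.append(0)
        else:
            moves.append(1)
    moves.reverse()
    return moves, value


print(backward_induction())
```

Each mover compares stopping now with the value of continuing, given that all later movers reason the same way, which is why the recommendation unravels all the way back to the first node.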

And here lies the conundrum. The sole conventionally mandated choice of play lies 
in the first player X choosing down at the first opportunity to gain a mere fraction of 
the potential payoff should they and their opponent play across a few times. Interest- 
ingly, most people playing this game will indeed ignore the conventionally sanctioned 
choice with both players typically playing across repeatedly to drastically improve their 
payoffs. Just as in the other games under consideration here, it seems intuitively obvi- 
ous to human players that adopting "non-rational" play will improve payoffs. However, 
conventional analysis has had trouble explaining these propensities. Here, we show that 
lifting implicit conventional bounds on rationality to allow players to take into account 
alternate probability spaces easily produces game theoretic predictions in agreement with 
observation. 

Altogether, the payoffs to the players in the centipede game considered here are 

$$
\begin{aligned}
\Pi^X &= (1 - x_1) + x_1 \left( y_1 \left[ 3(1 - x_2) + x_2 \left\{ 2(1 - y_2) + y_2 \left( 5(1 - x_3) + x_3 \left[ 4(1 - y_3) + 6y_3 \right] \right) \right\} \right] \right) \\
\Pi^Y &= x_1 \left( 2(1 - y_1) + y_1 \left[ 1(1 - x_2) + x_2 \left\{ 4(1 - y_2) + y_2 \left( 3(1 - x_3) + x_3 \left[ 6(1 - y_3) + 5y_3 \right] \right) \right\} \right] \right).
\end{aligned} \tag{10.1}
$$

As usual, players must then choose amongst their possible probability spaces $\mathcal{P}^X$ and 
$\mathcal{P}^Y$ to optimize their payoffs. A first choice will be the examination of the conventionally 
mandated probability space, which we turn to now. 
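
As a check on the nested payoff functions, the sketch below evaluates them at the seven possible ways the game can end. It is illustrative only: the helper `stop_at` and the function names are our own, and a play path is encoded as ones (across) up to the stopping node followed by zeros.

```python
def pay_x(x1, y1, x2, y2, x3, y3):
    """Payoff to X as a nested polynomial in the six binary moves."""
    return (1 - x1) + x1 * (
        y1 * (3 * (1 - x2) + x2 * (2 * (1 - y2) + y2 * (5 * (1 - x3) + x3 * (4 * (1 - y3) + 6 * y3))))
    )


def pay_y(x1, y1, x2, y2, x3, y3):
    """Payoff to Y as a nested polynomial in the six binary moves."""
    return x1 * (
        2 * (1 - y1)
        + y1 * (1 * (1 - x2) + x2 * (4 * (1 - y2) + y2 * (3 * (1 - x3) + x3 * (6 * (1 - y3) + 5 * y3))))
    )


def stop_at(k):
    """Moves (x1, y1, x2, y2, x3, y3) when the first k movers play across and the next plays down."""
    return tuple(1 if i < k else 0 for i in range(6))


for k in range(7):
    moves = stop_at(k)
    print(k, moves, (pay_x(*moves), pay_y(*moves)))
```

Only the moves up to the first "down" matter, since every later factor is multiplied by zero; this is the sense in which the polynomials encode the game tree.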



10.2.1 The unconstrained space $\mathcal{P}^X_B \times \mathcal{P}^Y_B$ 

To replicate the standard conventional analysis (the backwards induction analysis above), 
both players X and Y together adopt a joint probability space $\mathcal{P}^X_B \times \mathcal{P}^Y_B$ in which every 
behavioural strategy on every history set is independent (see Fig. 10.1). The expected 
payoff optimization problem for each player $Z \in \{X, Y\}$ can be written 

$$
\begin{aligned}
Z: \max \langle \Pi^Z \rangle &= \sum_{x_1,y_1,x_2,y_2,x_3,y_3=0}^{1} P^{XY}(x_1,y_1,x_2,y_2,x_3,y_3)\, \Pi^Z(x_1,y_1,x_2,y_2,x_3,y_3) \\
&= \sum_{x_1,y_1,x_2,y_2,x_3,y_3=0}^{1} P^X(x_1) P^Y(y_1|x_1) P^X(x_2|x_1y_1) P^Y(y_2|x_1y_1x_2) \\
&\qquad \times P^X(x_3|x_1y_1x_2y_2) P^Y(y_3|x_1y_1x_2y_2x_3)\, \Pi^Z(x_1,y_1,x_2,y_2,x_3,y_3).
\end{aligned} \tag{10.2}
$$

To simplify notation, we write $P^X(x_2|x_1y_1) \to p_{x_2|x_1y_1}$, $P^Y(y_2|x_1y_1x_2) \to q_{y_2|x_1y_1x_2}$ and so on, 
and we take account of normalization conditions $p_{0|x_1y_1} + p_{1|x_1y_1} = 1$ and $q_{0|x_1y_1x_2} + q_{1|x_1y_1x_2} = 1$ 
on all histories. 

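Consequently, the expected payoff optimization problem becomes 

$$
\begin{aligned}
X: \max_{p_1,\, p_{1|11},\, p_{1|1111}} \langle \Pi^X \rangle &= [1 - p_1] + p_1 \Big( 0 + q_{1|1} \Big[ 3(1 - p_{1|11}) + p_{1|11} \Big\{ 2(1 - q_{1|111}) \\
&\qquad + q_{1|111} \Big( 5(1 - p_{1|1111}) + p_{1|1111} \big[ 4(1 - q_{1|11111}) + 6 q_{1|11111} \big] \Big) \Big\} \Big] \Big) \\
Y: \max_{q_{1|1},\, q_{1|111},\, q_{1|11111}} \langle \Pi^Y \rangle &= p_1 \Big( 2(1 - q_{1|1}) + q_{1|1} \Big[ 1(1 - p_{1|11}) + p_{1|11} \Big\{ 4(1 - q_{1|111}) \\
&\qquad + q_{1|111} \Big( 3(1 - p_{1|1111}) + p_{1|1111} \big[ 6(1 - q_{1|11111}) + 5 q_{1|11111} \big] \Big) \Big\} \Big] \Big).
\end{aligned} \tag{10.3}
$$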



In these optimization problems, the players X and Y have respective independent 
probability parameters of $p_1, p_{1|11}, p_{1|1111}$ and $q_{1|1}, q_{1|111}, q_{1|11111}$, all of which can vary freely over 
[0, 1]. Consequently, in the joint space $\mathcal{P}^X_B \times \mathcal{P}^Y_B$, each player optimizes using the gradient 
operator 

$$
\nabla = \left[ \frac{\partial}{\partial p_1}, \frac{\partial}{\partial q_{1|1}}, \frac{\partial}{\partial p_{1|11}}, \frac{\partial}{\partial q_{1|111}}, \frac{\partial}{\partial p_{1|1111}}, \frac{\partial}{\partial q_{1|11111}} \right] \tag{10.4}
$$

as all other parameters disappear. The easiest way to complete the optimization is via 
backwards induction, so both players first evaluate the last stage choice of player Y via 

$$
\frac{\partial \langle \Pi^Y \rangle}{\partial q_{1|11111}} = -p_1 q_{1|1} p_{1|11} q_{1|111} p_{1|1111} \le 0, \tag{10.5}
$$

which is either zero, should any player have played down in any preceding stage, in which 
case Y is indifferent to any choice in this final stage, or negative, so essentially Y 
plays down via $q_{1|11111} = 0$ and $y_3 = 0$. This result allows player X to optimize their 
choice in the second last stage via 

$$
\frac{\partial \langle \Pi^X \rangle}{\partial p_{1|1111}} = -p_1 q_{1|1} p_{1|11} q_{1|111} \le 0, \tag{10.6}
$$

which again leads to the setting $p_{1|1111} = 0$ and $x_3 = 0$. A similar analysis proceeds 
backwards through all the stages to give the final solution, deducible by both players, 
of $(x_1,y_1,x_2,y_2,x_3,y_3) = (0,0,0,0,0,0)$. This choice garners players the conventionally 
mandated payoffs of $(\langle \Pi^X \rangle, \langle \Pi^Y \rangle) = (1,0)$. 

10.2.2 Isomorphically constrained spaces 

Naturally, players of unbounded rationality will not be content to merely examine the 
conventionally mandated joint probability space $\mathcal{P}^X_B \times \mathcal{P}^Y_B$ and will turn to consider 
alternative joint probability spaces. In each alternative space, isomorphic constraints 
alter game spaces and trees and thereby alter the subgame decompositions used in the 
conventional analysis to locate novel equilibria. We consider such alternatives now. 

As usual, there are an infinity of possible probability spaces that might be adopted 
by the players in the sequential centipede game, and we can here consider only a partial 
search of these possible spaces. We first suppose that the players restrict their attention 
to "Markovian" strategies in which the variable of a given stage is only conditioned on the 
outcome of the immediately preceding stage. The alternative, correlating variables in the 
given stage to the outcomes in every preceding stage, simply generates too many options 
without adding significantly to the analysis. Given this restriction, a moment's reflection 
will make it obvious that there is little point in a player choosing to anti-correlate their 
choice in a given stage to their opponent's previous choice. Their opponent must have 
played across, so an anti-correlation would simply force a move down, and this merely 
duplicates the outcomes of the conventional analysis above. The same considerations 
make it immediately attractive to have players consider perfect correlations between the 
opponent's choices in the preceding stage and the current choices in the present stage, as 
a previous choice of across then implies a current choice of across. We therefore suppose 
that players, in each stage after the first, can make their choices either independently or 
by correlation to their opponent's previous choice. 

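These considerations leave four possible probability spaces to be enacted by player X, 
namely 

$$
\mathcal{P}^X_B, \qquad
\mathcal{P}^X_B|_{x_2=y_1}, \qquad
\mathcal{P}^X_B|_{x_3=y_2}, \qquad
\mathcal{P}^X_B|_{x_2=y_1,x_3=y_2}. \tag{10.7}
$$

Similarly, there are eight possible spaces to be enacted by player Y, namely 

$$
\begin{aligned}
&\mathcal{P}^Y_B, \qquad
\mathcal{P}^Y_B|_{y_1=x_1}, \qquad
\mathcal{P}^Y_B|_{y_2=x_2}, \qquad
\mathcal{P}^Y_B|_{y_3=x_3}, \\
&\mathcal{P}^Y_B|_{y_1=x_1,y_2=x_2}, \qquad
\mathcal{P}^Y_B|_{y_1=x_1,y_3=x_3}, \qquad
\mathcal{P}^Y_B|_{y_2=x_2,y_3=x_3}, \qquad
\mathcal{P}^Y_B|_{y_1=x_1,y_2=x_2,y_3=x_3}.
\end{aligned} \tag{10.8}
$$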



Altogether, this makes 32 joint probability spaces that need be considered. We now turn 
to follow the players in their analysis of the outcomes from their joint adoption of all of 
these combinations of spaces. 

10.2.3 The space $\mathcal{P}^X_B|_{x_2=y_1,x_3=y_2} \times \mathcal{P}^Y_B|_{y_1=x_1,y_2=x_2,y_3=x_3}$ 

Given the joint probability space $\mathcal{P}^X_B|_{x_2=y_1,x_3=y_2} \times \mathcal{P}^Y_B|_{y_1=x_1,y_2=x_2,y_3=x_3}$ in which every 
variable after the first stage is isomorphically constrained to be perfectly correlated to 
the preceding choice by their opponent, the variable assignment reduces to 
$y_3 = x_3 = y_2 = x_2 = y_1 = x_1$. Subsequently, the payoff functions for both players become 

$$
\begin{aligned}
\Pi^X &= (1 - x_1) + x_1 \left( y_1 \left[ 3(1 - x_2) + x_2 \left\{ 2(1 - y_2) + y_2 \left( 5(1 - x_3) + x_3 \left[ 4(1 - y_3) + 6y_3 \right] \right) \right\} \right] \right) \\
&= (1 - x_1) + x_1 \left( x_1 \left[ 3(1 - x_1) + x_1 \left\{ 2(1 - x_1) + x_1 \left( 5(1 - x_1) + x_1 \left[ 4(1 - x_1) + 6x_1 \right] \right) \right\} \right] \right) \\
&= 1 + 5x_1 \\
\Pi^Y &= x_1 \left( 2(1 - y_1) + y_1 \left[ 1(1 - x_2) + x_2 \left\{ 4(1 - y_2) + y_2 \left( 3(1 - x_3) + x_3 \left[ 6(1 - y_3) + 5y_3 \right] \right) \right\} \right] \right) \\
&= x_1 \left( 2(1 - x_1) + x_1 \left[ 1(1 - x_1) + x_1 \left\{ 4(1 - x_1) + x_1 \left( 3(1 - x_1) + x_1 \left[ 6(1 - x_1) + 5x_1 \right] \right) \right\} \right] \right) \\
&= 5x_1.
\end{aligned} \tag{10.9}
$$

Here, it is immediately evident that player X maximizes their payoff by setting $x_1 = 1$, 
generating a sequence of play of $(x_1,x_2,x_3,y_1,y_2,y_3) = (1,1,1,1,1,1)$ and payoffs of 
$(\Pi^X, \Pi^Y) = (6,5)$. 
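
The collapse of the payoff functions in the fully correlated space is easy to confirm for binary choices. The following sketch is an illustration with function names of our own: it sets every later move equal to x1 and compares the result with 1 + 5 x1 and 5 x1. Note that the simplification relies on x1 being 0 or 1, so that powers of x1 equal x1.

```python
def payoffs(x1, y1, x2, y2, x3, y3):
    """Centipede payoffs (payoff_X, payoff_Y) as nested polynomials in the six binary moves."""
    pay_x = (1 - x1) + x1 * y1 * (
        3 * (1 - x2) + x2 * (2 * (1 - y2) + y2 * (5 * (1 - x3) + x3 * (4 * (1 - y3) + 6 * y3)))
    )
    pay_y = x1 * (
        2 * (1 - y1)
        + y1 * ((1 - x2) + x2 * (4 * (1 - y2) + y2 * (3 * (1 - x3) + x3 * (6 * (1 - y3) + 5 * y3))))
    )
    return pay_x, pay_y


def fully_correlated(x1):
    """Payoffs when every later move copies the preceding move, so all moves equal x1."""
    return payoffs(x1, x1, x1, x1, x1, x1)


def best_first_move():
    """X's payoff maximizing first move in the fully correlated space."""
    return max((0, 1), key=lambda x1: fully_correlated(x1)[0])


for x1 in (0, 1):
    print(x1, fully_correlated(x1), (1 + 5 * x1, 5 * x1))
print(best_first_move())
```

With every later move tied to the first, the whole game reduces to a single decision by X, and playing across is worth five more units to each player than playing down.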

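A similar result is obtained from optimizing the expected payoffs via 

$$
\begin{aligned}
X: \max_{p_1} \langle \Pi^X \rangle &= \sum_{x_1,y_1,x_2,y_2,x_3,y_3=0}^{1} P^X(x_1)\, \delta_{y_1x_1} \delta_{x_2y_1} \delta_{y_2x_2} \delta_{x_3y_2} \delta_{y_3x_3}\, \Pi^X(x_1,y_1,x_2,y_2,x_3,y_3) \\
&= \sum_{x_1=0}^{1} P^X(x_1)\, \Pi^X(x_1,x_1,x_1,x_1,x_1,x_1) \\
&= 1 + 5p_1 \\
Y: \max \langle \Pi^Y \rangle &= \sum_{x_1,y_1,x_2,y_2,x_3,y_3=0}^{1} P^X(x_1)\, \delta_{y_1x_1} \delta_{x_2y_1} \delta_{y_2x_2} \delta_{x_3y_2} \delta_{y_3x_3}\, \Pi^Y(x_1,y_1,x_2,y_2,x_3,y_3) \\
&= \sum_{x_1=0}^{1} P^X(x_1)\, \Pi^Y(x_1,x_1,x_1,x_1,x_1,x_1) \\
&= 5p_1.
\end{aligned} \tag{10.10}
$$

Here, player Y has left themselves no choices in any stage. As a result, the optimization 
is completed by 

$$
\frac{\partial \langle \Pi^X \rangle}{\partial p_1} = 5 > 0, \tag{10.11}
$$

so X sets $p_1 = 1$ to choose $x_1 = 1$ and plays across in stage 1. This choice is mimicked in 
every subsequent stage, giving $(x_1,y_1,x_2,y_2,x_3,y_3) = (1,1,1,1,1,1)$ to generate payoffs 
to the players of $(\langle \Pi^X \rangle, \langle \Pi^Y \rangle) = (6,5)$. 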



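10.2.4 The space $\mathcal{P}^X_B|_{x_2=y_1,x_3=y_2} \times \mathcal{P}^Y_B|_{y_2=x_2,y_3=x_3}$ 

In the joint probability space $\mathcal{P}^X_B|_{x_2=y_1,x_3=y_2} \times \mathcal{P}^Y_B|_{y_2=x_2,y_3=x_3}$ the variable assignment 
reduces to $y_3 = x_3 = y_2 = x_2 = y_1$, so the payoff functions become 

$$
\begin{aligned}
\Pi^X &= (1 - x_1) + x_1 \left( y_1 \left[ 3(1 - x_2) + x_2 \left\{ 2(1 - y_2) + y_2 \left( 5(1 - x_3) + x_3 \left[ 4(1 - y_3) + 6y_3 \right] \right) \right\} \right] \right) \\
&= (1 - x_1) + x_1 \left( y_1 \left[ 3(1 - y_1) + y_1 \left\{ 2(1 - y_1) + y_1 \left( 5(1 - y_1) + y_1 \left[ 4(1 - y_1) + 6y_1 \right] \right) \right\} \right] \right) \\
&= 1 - x_1 + 6x_1y_1 \\
\Pi^Y &= x_1 \left( 2(1 - y_1) + y_1 \left[ 1(1 - x_2) + x_2 \left\{ 4(1 - y_2) + y_2 \left( 3(1 - x_3) + x_3 \left[ 6(1 - y_3) + 5y_3 \right] \right) \right\} \right] \right) \\
&= x_1 (2 + 3y_1).
\end{aligned} \tag{10.12}
$$

These payoff functions establish that player Y maximizes their payoff by setting $y_1 = 1$ 
while player X maximizes their income by setting $x_1 = 1$, generating a sequence of play 
of $(x_1,x_2,x_3,y_1,y_2,y_3) = (1,1,1,1,1,1)$ and payoffs of $(\Pi^X, \Pi^Y) = (6,5)$. 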
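
The reduced two variable game can be checked numerically. The sketch below is an illustration with function names of our own, restricted to pure (binary) choices: it ties every move after y1 to y1, compares the payoffs with 1 - x1 + 6 x1 y1 and x1 (2 + 3 y1), and then solves the reduced game sequentially, with Y replying to x1 and X anticipating that reply.

```python
def payoffs(x1, y1, x2, y2, x3, y3):
    """Centipede payoffs (payoff_X, payoff_Y) as nested polynomials in the six binary moves."""
    pay_x = (1 - x1) + x1 * y1 * (
        3 * (1 - x2) + x2 * (2 * (1 - y2) + y2 * (5 * (1 - x3) + x3 * (4 * (1 - y3) + 6 * y3)))
    )
    pay_y = x1 * (
        2 * (1 - y1)
        + y1 * ((1 - x2) + x2 * (4 * (1 - y2) + y2 * (3 * (1 - x3) + x3 * (6 * (1 - y3) + 5 * y3))))
    )
    return pay_x, pay_y


def reduced(x1, y1):
    """The constraints x2 = y1, y2 = x2, x3 = y2, y3 = x3 copy y1 into every later move."""
    return payoffs(x1, y1, y1, y1, y1, y1)


def y_reply(x1):
    """Y's payoff maximizing first choice after observing x1 (ties resolve to 0)."""
    return max((0, 1), key=lambda y1: reduced(x1, y1)[1])


def solve():
    """X picks x1 anticipating Y's reply; returns ((x1, y1), payoffs)."""
    x1 = max((0, 1), key=lambda x: reduced(x, y_reply(x))[0])
    y1 = y_reply(x1)
    return (x1, y1), reduced(x1, y1)


print(solve())
```

Here Y retains one genuine decision, and because continuing is worth 5 to Y against 2 for stopping, X can safely play across at the first node.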
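The expected payoff functions optimization task becomes 

$$
\begin{aligned}
X: \max_{p_1} \langle \Pi^X \rangle &= \sum_{x_1,y_1,x_2,y_2,x_3,y_3=0}^{1} P^X(x_1) P^Y(y_1|x_1)\, \delta_{x_2y_1} \delta_{y_2x_2} \delta_{x_3y_2} \delta_{y_3x_3}\, \Pi^X(x_1,y_1,x_2,y_2,x_3,y_3) \\
&= \sum_{x_1y_1=0}^{1} P^X(x_1) P^Y(y_1|x_1)\, \Pi^X(x_1,y_1,y_1,y_1,y_1,y_1) \\
&= 1 - p_1 + 6p_1q_{1|1} \\
Y: \max_{q_{1|1}} \langle \Pi^Y \rangle &= \sum_{x_1,y_1,x_2,y_2,x_3,y_3=0}^{1} P^X(x_1) P^Y(y_1|x_1)\, \delta_{x_2y_1} \delta_{y_2x_2} \delta_{x_3y_2} \delta_{y_3x_3}\, \Pi^Y(x_1,y_1,x_2,y_2,x_3,y_3) \\
&= \sum_{x_1y_1=0}^{1} P^X(x_1) P^Y(y_1|x_1)\, \Pi^Y(x_1,y_1,y_1,y_1,y_1,y_1) \\
&= p_1 \left[ 2 + 3q_{1|1} \right].
\end{aligned} \tag{10.13}
$$

In this case, the optimization is completed by 

$$
\begin{aligned}
\frac{\partial \langle \Pi^X \rangle}{\partial p_1} &= -1 + 6q_{1|1} \\
\frac{\partial \langle \Pi^Y \rangle}{\partial q_{1|1}} &= 3p_1.
\end{aligned} \tag{10.14}
$$

Essentially then, player Y notes their positive gradient and so sets $q_{1|1} = 1$ to give 
$y_1 = 1$. In turn, player X deduces this and sets $p_1 = 1$ to give $x_1 = 1$. Together, in 
the joint probability space $\mathcal{P}^X_B|_{x_2=y_1,x_3=y_2} \times \mathcal{P}^Y_B|_{y_2=x_2,y_3=x_3}$, the optimization generates the 
play choices $(x_1,y_1,x_2,y_2,x_3,y_3) = (1,1,1,1,1,1)$ to generate payoffs to the players of 
$(\langle \Pi^X \rangle, \langle \Pi^Y \rangle) = (6,5)$. 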



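10.2.5 Expected payoff comparison across multiple probability 
spaces 

Similar analysis to that above can be applied to evaluate the expected payoffs in all the 
other combinations of joint probability spaces to give the payoff combination table 

$$
\begin{array}{l|cccc}
(\langle \Pi^X \rangle, \langle \Pi^Y \rangle) & \mathcal{P}^X_B|_{x_2=y_1,x_3=y_2} & \mathcal{P}^X_B|_{x_3=y_2} & \mathcal{P}^X_B|_{x_2=y_1} & \mathcal{P}^X_B \\ \hline
\mathcal{P}^Y_B|_{y_1=x_1,y_2=x_2,y_3=x_3} & (6,5) & (6,5) & (6,5) & (6,5) \\
\mathcal{P}^Y_B|_{y_1=x_1,y_3=x_3} & (6,5) & (6,5) & (6,5) & (6,5) \\
\mathcal{P}^Y_B|_{y_2=x_2,y_3=x_3} & (6,5) & (6,5) & (6,5) & (6,5) \\
\mathcal{P}^Y_B|_{y_3=x_3} & (6,5) & (6,5) & (6,5) & (6,5) \\
\mathcal{P}^Y_B|_{y_1=x_1,y_2=x_2} & (4,6) & (4,6) & (5,3) & (5,3) \\
\mathcal{P}^Y_B|_{y_2=x_2} & (4,6) & (4,6) & (5,3) & (5,3) \\
\mathcal{P}^Y_B|_{y_1=x_1} & (4,6) & (4,6) & (2,4) & (3,1) \\
\mathcal{P}^Y_B & (4,6) & (4,6) & (2,4) & (1,0)
\end{array} \tag{10.15}
$$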
Figure 10.2: The 32 distinct trees and equilibrium pathways (indicated by triangles) given 
that players X and Y adopt the probability spaces shown. Dots indicate successive 
decision nodes, where nodes with a descending vertical line are independent decision points 
and nodes lacking a descending vertical line are isomorphically constrained to equal the 
immediately preceding decision. 

The equivalent trees and equilibrium pathways are shown in Fig. 10.2. Perusal of this 
table makes it clear that players do not optimize their payoffs by choosing the conventionally 
mandated joint probability space. Rather, it is much more likely that Y will choose 
any probability space in which their last stage variable is isomorphically constrained. In 
turn, this alters the payoffs for player X in such a way as to render them indifferent to 
any choice of probability space. The net result will be that X will find themselves playing 
across in the first stage irrespective of which space they adopt. 
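
The whole payoff comparison can be regenerated by a short search over the 32 joint spaces. The sketch below is our own reading of the construction, offered as an illustration rather than the author's procedure: a node that is perfectly correlated to the opponent's preceding move can only be reached after that opponent played across, so it is modelled as a forced "across", while every independent node is resolved by backwards induction. The names `solve` and `table` are ours.

```python
from collections import Counter
from itertools import product

# Payoff pairs (X, Y) when play stops at node k = 0..5 (order x1, y1, x2, y2, x3, y3),
# and the payoff pair if every node plays across.
DOWN = [(1, 0), (0, 2), (3, 1), (2, 4), (5, 3), (4, 6)]
END = (6, 5)
X_NODES = (2, 4)     # x2 and x3 may be constrained; x1 (node 0) is always free
Y_NODES = (1, 3, 5)  # y1, y2 and y3 may be constrained


def solve(forced):
    """Backward induction where nodes in `forced` must copy the preceding 'across'."""
    value = END
    for k in reversed(range(6)):
        if k in forced:
            continue  # constrained node: continuation value passes straight through
        mover = k % 2  # 0 means X moves, 1 means Y moves
        if DOWN[k][mover] >= value[mover]:
            value = DOWN[k]
    return value


def table():
    """Map (X constraint flags, Y constraint flags) to the resulting payoff pair."""
    result = {}
    for xs in product((False, True), repeat=2):
        for ys in product((False, True), repeat=3):
            forced = {n for n, f in zip(X_NODES, xs) if f} | {n for n, f in zip(Y_NODES, ys) if f}
            result[(xs, ys)] = solve(forced)
    return result


outcomes = table()
print(Counter(outcomes.values()))
```

Under this reading, every joint space in which Y constrains the final move y3 yields (6,5) whatever X does, and the conventional outcome (1,0) appears only when neither player adopts any constraint.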

A more sophisticated analysis in a longer game would take into account end-game 
effects where players might express some preference for terminating the game slightly 
early. Such tendencies are similar to those seen in the finite iterated prisoner's dilemma 
game, and as there, are not likely to make it irrational for players to play across in the 
early stages of the centipede game. 

The extended analysis presented here produces game theoretic predictions in substantial 
accord with observed human play in the centipede game. As noted above, this 
agreement contrasts sharply with the manifest contradiction between the game theoretic 
predictions of conventional analysis and observed play tendencies. As such, we take these 
observations as evidence that humans naturally take account of isomorphic constraints 
in strategic play in game theory. 
Chapter 11



The Iterated Prisoner's Dilemma 



11.1 Introduction 

Conventional game analysis holds that it is rational for players in a finite iterated
prisoner's dilemma to adopt the noncooperative "all defect" strategy as the optimal solution
under common knowledge of rationality (CKR), even though human players are commonly
observed to increase their payoffs by irrationally adopting alternative strategies. There are
many observations of this mismatch between theoretical prediction and observed behaviour
[96, 97, 98, 99]. These mismatches have typically been explained by introducing behavioural
factors such as bounded rationality, incomplete information, and other innate
tendencies promoting cooperative and altruistic behaviours. In particular, these suggestions
include modifying definitions of rationality to include reciprocity, fairness and
altruism or to otherwise bound rationality [100, 101, 102, 103, 104, 105], modelling
the evolution of cooperation [106, 77], taking account of incomplete information
[107, 108, 109, 110] and uncertainty in the number of repeat stages [111], bounding the
complexity of implementable strategies [112, 113, 114], accounting for communication
and coordination costs [115], incorporating reputation and experimentation effects [116]
or secondary utility functions as in benevolence theory [25] or in moral discussions [117],
including adaptive learning [118] or fuzzy logic [119], or, more directly, employing
comprehensive constructions of normal form strategy tables incorporating belief strategies
[120, 121, 122]. Interestingly, quantum correlations can also be introduced to resolve the
prisoner's dilemma [123].

11.2 The finite Iterated Prisoner's Dilemma 

In this chapter, we will examine the finite iterated prisoner's dilemma while using the
strong isomorphic mappings of probability theory to construct our mixed and behavioural
strategy game spaces. Our particular focus will be to examine whether cooperation is
rational in the finite iterated prisoner's dilemma. As usual, we assume our players are
rational and of unbounded capacity, and that they have adopted common knowledge of
rationality (CKR). An illustrative game tree depicting a two stage iterated prisoner's
dilemma is shown in Fig. 11.1.

Figure 11.1: A two stage game decision tree where two non-communicating players
simultaneously choose moves $x_n$ or $y_n$ equal to "0" or "1" at stage $n$ with respective
probabilities $P^X(x_n|H_n)$ and $P^Y(y_n|H_n)$ at every decision point. At the beginning of
each stage, players know the history sets $H_n = \{x_1, y_1, \dots, x_{n-1}, y_{n-1}\}$ detailing the
shared information known to both players of all choices to that stage (with $H_1 = \{\}$).
Players also know their cumulative payoffs $(\Pi^X, \Pi^Y)$ to that point.

The finite iterated prisoner's dilemma is defined here over a finite number $N$ of stages,
where at each stage $1 \le n \le N$ two non-communicating players X and Y choose moves
$x_n$ and $y_n$ to be either 0 (cooperation) or 1 (defection). The payoffs gained in
each stage are given by the payoff matrix



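\[
\begin{array}{cc|cc}
  &   & \multicolumn{2}{c}{Y} \\
  &   & 0 & 1 \\ \hline
X & 0 & (2,2) & (0,3) \\
  & 1 & (3,0) & (1,1),
\end{array}
\tag{11.1}
\]

equivalent to the single stage payoff functions

\[
\begin{aligned}
\Pi^X(x_n, y_n) &= 2 + x_n - 2 y_n \\
\Pi^Y(x_n, y_n) &= 2 - 2 x_n + y_n.
\end{aligned}
\tag{11.2}
\]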
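As a quick consistency check, the following Python sketch confirms that the linear payoff functions of Eq. (11.2) reproduce every cell of the payoff matrix of Eq. (11.1). The names `PAYOFF_MATRIX` and `stage_payoffs` are illustrative choices and do not appear in the text.

```python
# Payoff matrix of Eq. (11.1), keyed by (x_n, y_n) with 0 = cooperate, 1 = defect.
PAYOFF_MATRIX = {
    (0, 0): (2, 2), (0, 1): (0, 3),
    (1, 0): (3, 0), (1, 1): (1, 1),
}


def stage_payoffs(x, y):
    """Single stage payoffs (Pi^X, Pi^Y) from the linear forms of Eq. (11.2)."""
    return 2 + x - 2 * y, 2 - 2 * x + y


# Compare the linear forms against the matrix, cell by cell.
matches = all(stage_payoffs(x, y) == cell for (x, y), cell in PAYOFF_MATRIX.items())
print(matches)
```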



For multiple stage games, the total game payoffs of a finite $N$ stage game are simply the sum
of single stage payoffs. The optimization problem for both players is then

\[
\begin{aligned}
X &: \max_{x_1, \dots, x_N} \Pi^X(x_1, y_1, \dots, x_N, y_N) = \sum_{n=1}^{N} (2 + x_n - 2 y_n) \\
Y &: \max_{y_1, \dots, y_N} \Pi^Y(x_1, y_1, \dots, x_N, y_N) = \sum_{n=1}^{N} (2 - 2 x_n + y_n).
\end{aligned}
\tag{11.3}
\]

Each player desires to maximize their respective endgame payoffs by varying their
respective move choices $x_n$ and $y_n$ over every stage of the game. (The players know $N$ in
advance.)

Yet more generally, players choose their moves probabilistically to prevent their
opponent predicting and exploiting deterministic strategies. The players will then adopt
the joint probability space $\mathcal{P}^X \times \mathcal{P}^Y$, and so seek to maximize their respective expected
payoffs

\[
\begin{aligned}
X : \max \langle \Pi^X \rangle
 &= \sum_{x_1 \dots y_N = 0}^{1} P^{XY}(x_1, y_1, \dots, x_N, y_N)\, \Pi^X(x_1, y_1, \dots, x_N, y_N) \\
 &= \sum_{x_1 \dots y_N = 0}^{1} P^X(x_1) P^Y(y_1) P^X(x_2|H_2) P^Y(y_2|H_2) \cdots
    P^X(x_N|H_N) P^Y(y_N|H_N) \sum_{n=1}^{N} (2 + x_n - 2 y_n) \\
 &= 2N + \sum_{n=1}^{N} \sum_{x_1 \dots y_n = 0}^{1} P^X(x_1) P^Y(y_1) P^X(x_2|H_2) P^Y(y_2|H_2) \cdots
    P^X(x_n|H_n) P^Y(y_n|H_n)\, (x_n - 2 y_n) \\
Y : \max \langle \Pi^Y \rangle
 &= \sum_{x_1 \dots y_N = 0}^{1} P^{XY}(x_1, y_1, \dots, x_N, y_N)\, \Pi^Y(x_1, y_1, \dots, x_N, y_N) \\
 &= \sum_{x_1 \dots y_N = 0}^{1} P^X(x_1) P^Y(y_1) P^X(x_2|H_2) P^Y(y_2|H_2) \cdots
    P^X(x_N|H_N) P^Y(y_N|H_N) \sum_{n=1}^{N} (2 - 2 x_n + y_n) \\
 &= 2N + \sum_{n=1}^{N} \sum_{x_1 \dots y_n = 0}^{1} P^X(x_1) P^Y(y_1) P^X(x_2|H_2) P^Y(y_2|H_2) \cdots
    P^X(x_n|H_n) P^Y(y_n|H_n)\, (y_n - 2 x_n).
\end{aligned}
\tag{11.4}
\]

We have written $P^Z(z_n|H_n)$ as the conditioned probability distribution at stage $n$ that
player $Z$ chooses move $z_n$ (either $x_n$ or $y_n$) given the history $H_n = \{x_1, y_1, \dots, x_{n-1}, y_{n-1}\}$
detailing the shared information known to both players of all choices to that stage (with
$H_1 = \{\}$). We further write $P^X(x_1|H_1) = P^X(x_1) = p_{x_1}$, $P^Y(y_1|H_1) = P^Y(y_1) = q_{y_1}$,
$P^X(x_n|H_n) = p_{x_n|H_n}$ and $P^Y(y_n|H_n) = q_{y_n|H_n}$. The expected payoffs are obtained
by summing over every possible path through the game tree specified by the move
choices $x_1, y_1, \dots, x_N, y_N$, with each path weighted by the joint probability
$P^{XY}(x_1, y_1, \dots, x_N, y_N)$ of that path being selected, and where each path generates a payoff
of $\Pi^Z(x_1, y_1, \dots, x_N, y_N)$ for player $Z$.
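The path sum described above can be sketched directly in Python. This is a minimal illustration rather than the author's procedure: the function `expected_payoffs` and its callback arguments are hypothetical names, each callback returning a player's defection probability given the history of earlier moves. Exact rational arithmetic is used so that results can be compared with the closed forms in the text.

```python
from fractions import Fraction
from itertools import product


def expected_payoffs(n_stages, defect_prob_x, defect_prob_y):
    """Sum payoffs over every path, weighting each path by its probability.

    defect_prob_x(history) and defect_prob_y(history) give the probability of
    playing 1 (defect) at the next stage, where history is a tuple of the
    earlier (x, y) move pairs.
    """
    total_x = total_y = Fraction(0)
    for moves in product((0, 1), repeat=2 * n_stages):
        path = [(moves[2 * i], moves[2 * i + 1]) for i in range(n_stages)]
        weight = Fraction(1)
        pay_x = pay_y = 0
        for n, (x, y) in enumerate(path):
            history = tuple(path[:n])
            px = defect_prob_x(history)
            py = defect_prob_y(history)
            weight *= (px if x == 1 else 1 - px) * (py if y == 1 else 1 - py)
            pay_x += 2 + x - 2 * y      # single stage payoff to X
            pay_y += 2 - 2 * x + y      # single stage payoff to Y
        total_x += weight * pay_x
        total_y += weight * pay_y
    return total_x, total_y


# Single stage game with defection probabilities p1 = 1/3 and q1 = 1/4.
p1, q1 = Fraction(1, 3), Fraction(1, 4)
single_stage = expected_payoffs(1, lambda h: p1, lambda h: q1)
print(single_stage)
```

For a single stage the result agrees with $2 + p_1 - 2 q_1$ and $2 - 2 p_1 + q_1$; history-dependent callbacks allow the same function to represent constrained second stage choices.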

Here, as usual, the players X and Y vary their choice of respective probability space
$\mathcal{P}^X$ and $\mathcal{P}^Y$ so as to maximize their expected payoff. That is, we hold that such players
will avail themselves of the strong isomorphic mappings adopted by probability theory to
construct their mixed or behavioural strategy spaces. Hence, each player will sequentially
analyze situations where both players adopt altered joint probability spaces $\mathcal{P}^X_i \times \mathcal{P}^Y_j$
for $i, j = 0, 1, 2, \dots$. The infinity of possible alternatives mandates that some limits be
placed on the search space.

In the following analysis, we will first consider the N = 1 single stage prisoner's dilemma
game. This will inform our subsequent analysis of the N = 2 stage prisoner's dilemma. 
We will analyze the N = 2 stage game by comparing three strategies commonly found in 
the literature — conventional independent play, a Tit-For-Tat strategy, and All Defect — 
with a functionally correlated Markovian probability strategy space. This analysis will 
then be generalized to consider a total of 256 alternate joint probability spaces. Finally, 
we will consider a multiple stage game with N arbitrary and analyze a number of alternate 
joint probability spaces. 

11.3 The N = 1 stage prisoner's dilemma

The single stage prisoner's dilemma has the players seeking to optimize the payoff functions

\[
\begin{aligned}
X &: \max_{x_1} \Pi^X(x_1, y_1) = 2 + x_1 - 2 y_1 \\
Y &: \max_{y_1} \Pi^Y(x_1, y_1) = 2 - 2 x_1 + y_1.
\end{aligned}
\tag{11.5}
\]

We suppose that players adopt a joint behavioural probability space $\mathcal{P}^X_B \times \mathcal{P}^Y_B$. Because
of the lack of communication, the choices of the $x_1$ and $y_1$ variables are independent.
One possible joint probability space defines the expected payoff optimization problem for
each player as

\[
\begin{aligned}
X : \max_{p_1} \langle \Pi^X \rangle
 &= \sum_{x_1 y_1 = 0}^{1} P^{XY}(x_1, y_1)\, \Pi^X(x_1, y_1) \\
 &= \sum_{x_1 y_1 = 0}^{1} P^X(x_1) P^Y(y_1)\, (2 + x_1 - 2 y_1) \\
 &= 2 + p_1 - 2 q_1 \\
Y : \max_{q_1} \langle \Pi^Y \rangle
 &= \sum_{x_1 y_1 = 0}^{1} P^{XY}(x_1, y_1)\, \Pi^Y(x_1, y_1) \\
 &= \sum_{x_1 y_1 = 0}^{1} P^X(x_1) P^Y(y_1)\, (2 - 2 x_1 + y_1) \\
 &= 2 - 2 p_1 + q_1,
\end{aligned}
\tag{11.6}
\]

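where use has been made of the normalization conditions $p_0 + p_1 = 1$ and $q_0 + q_1 = 1$.
In this two-player-single-stage game, each expected payoff is a function of the
independent parameters $p_1$ and $q_1$, and so is maximized using the gradient operator

\[
\nabla = \left[ \frac{\partial}{\partial p_1}, \frac{\partial}{\partial q_1} \right],
\tag{11.7}
\]

giving the joint optimization conditions

\[
\begin{aligned}
\frac{\partial \langle \Pi^X \rangle}{\partial p_1} &= 1 > 0 \\
\frac{\partial \langle \Pi^Y \rangle}{\partial q_1} &= 1 > 0.
\end{aligned}
\tag{11.8}
\]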

Together, these make it evident that each player optimizes their expected payoff by
maximizing their defection probability (choosing $p_1 = 1$ and $q_1 = 1$) irrespective of their
opponent's choices. That is, both players defect with certainty. This is the unique
single-stage Nash equilibrium point [3] from which neither player can unilaterally alter their
choice without worsening their payoff. Even so, payoffs are jointly maximized when both
players cooperate (via $x_1 = y_1 = 0$) to yield payoffs of $(\Pi^X, \Pi^Y) = (2, 2)$. Herein lies the
dilemma.
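The single stage result can be checked by brute force. The sketch below enumerates the four pure move combinations and keeps those from which neither player can gain by deviating unilaterally. The helper names are illustrative, and only pure moves are examined here.

```python
from itertools import product


def stage_payoffs(x, y):
    """Single stage payoffs (Pi^X, Pi^Y) of Eq. (11.5)."""
    return 2 + x - 2 * y, 2 - 2 * x + y


def pure_nash_equilibria():
    """Return every pure move pair (x1, y1) with no profitable unilateral deviation."""
    equilibria = []
    for x, y in product((0, 1), repeat=2):
        pay_x, pay_y = stage_payoffs(x, y)
        x_cannot_improve = all(stage_payoffs(a, y)[0] <= pay_x for a in (0, 1))
        y_cannot_improve = all(stage_payoffs(x, b)[1] <= pay_y for b in (0, 1))
        if x_cannot_improve and y_cannot_improve:
            equilibria.append((x, y))
    return equilibria


equilibria = pure_nash_equilibria()
print(equilibria, stage_payoffs(1, 1), stage_payoffs(0, 0))
```

Mutual defection is the only combination that survives, even though mutual cooperation pays both players more.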

We now turn to consider the N = 2 stage iterated prisoner's dilemma. 

11.4 The N = 2 stage prisoner's dilemma 

For the N = 2 stage game, the optimization problem for both players is

\[
\begin{aligned}
X &: \max_{x_1, x_2} \Pi^X = \sum_{n=1}^{2} (2 + x_n - 2 y_n) \\
Y &: \max_{y_1, y_2} \Pi^Y = \sum_{n=1}^{2} (2 - 2 x_n + y_n).
\end{aligned}
\tag{11.9}
\]

The question which needs to be addressed by each player is how to take account of 
all of the possible functional relationships that might exist between the variables. Of 
course, when the variables are functionally related then this imposes constraints onto 
the calculation of gradients which affects optimization outcomes. Game theory presumes
there exists a single space which properly takes into account every possible functional 
dependency. Probability theory and optimization theory in general hold that no such 
single space exists. These fields employ a multiplicity of distinct spaces in order to take 
account of the different possible dependencies. In what follows, we will consider a small 
number of different possible functional dependencies. 
11.4.1 The unconstrained space $\mathcal{P}^X_B \times \mathcal{P}^Y_B$

Conventional game analysis assumes that rational players X and Y will adopt a single
specific joint probability space, denoted here $\mathcal{P}^X_B \times \mathcal{P}^Y_B$. In this space, the absence of
isomorphism constraints means that all behavioural strategies are independent, allowing
the game to be decomposed into subgames in every history separating the last stage from
the preceding stage. Then, optimization in the last stage is independent of both prior
and non-existent future events, so the last stage is identically a single stage game and is
optimized in the prisoner's dilemma via the unique single stage Nash equilibrium of mutual
defection. This process can then be iterated backwards through the game (backwards
induction) to locate the unique Nash equilibrium for the entire game of mutual defection
in every stage. We now detail this analysis.

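In the space $\mathcal{P}^X_B \times \mathcal{P}^Y_B$, players seek to optimize their respective expected payoffs

\[
\begin{aligned}
X : \max \langle \Pi^X \rangle
 &= 2N + \sum_{n=1}^{2} \sum_{x_1 \dots y_n = 0}^{1} P^X(x_1) P^Y(y_1) \cdots P^X(x_n|H_n) P^Y(y_n|H_n)\, (x_n - 2 y_n) \\
 &= 4 + p_1 - 2 q_1 + [1 - p_1][1 - q_1]\, [p_{1|00} - 2 q_{1|00}] \\
 &\quad + [1 - p_1]\, q_1\, [p_{1|01} - 2 q_{1|01}] \\
 &\quad + p_1 [1 - q_1]\, [p_{1|10} - 2 q_{1|10}] \\
 &\quad + p_1 q_1\, [p_{1|11} - 2 q_{1|11}] \\
Y : \max \langle \Pi^Y \rangle
 &= 2N + \sum_{n=1}^{2} \sum_{x_1 \dots y_n = 0}^{1} P^X(x_1) P^Y(y_1) \cdots P^X(x_n|H_n) P^Y(y_n|H_n)\, (y_n - 2 x_n) \\
 &= 4 - 2 p_1 + q_1 + [1 - p_1][1 - q_1]\, [q_{1|00} - 2 p_{1|00}] \\
 &\quad + [1 - p_1]\, q_1\, [q_{1|01} - 2 p_{1|01}] \\
 &\quad + p_1 [1 - q_1]\, [q_{1|10} - 2 p_{1|10}] \\
 &\quad + p_1 q_1\, [q_{1|11} - 2 p_{1|11}].
\end{aligned}
\tag{11.10}
\]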



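These expected payoff functions can take account of every possible state of correlation
between the second stage variables $x_2$ and $y_2$ and the first stage variables $x_1$ and
$y_1$. The first stage probability variables $p_1, q_1$, together with the second stage variables
$p_{1|00}, p_{1|01}, p_{1|10}, p_{1|11}$ and $q_{1|00}, q_{1|01}, q_{1|10}, q_{1|11}$, are all freely varying over the range [0, 1].
As a result, the relevant gradient operator used by both players to analyze this particular
probability space is

\[
\nabla = \left[ \frac{\partial}{\partial p_1}, \frac{\partial}{\partial q_1},
 \frac{\partial}{\partial p_{1|00}}, \frac{\partial}{\partial p_{1|01}},
 \frac{\partial}{\partial p_{1|10}}, \frac{\partial}{\partial p_{1|11}},
 \frac{\partial}{\partial q_{1|00}}, \frac{\partial}{\partial q_{1|01}},
 \frac{\partial}{\partial q_{1|10}}, \frac{\partial}{\partial q_{1|11}} \right].
\tag{11.11}
\]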
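Immediately then, optimization with respect to second stage variables by player X gives

\[
\begin{aligned}
\frac{\partial \langle \Pi^X \rangle}{\partial p_{1|00}} &= [1 - p_1][1 - q_1] \ge 0 \\
\frac{\partial \langle \Pi^X \rangle}{\partial p_{1|01}} &= [1 - p_1]\, q_1 \ge 0 \\
\frac{\partial \langle \Pi^X \rangle}{\partial p_{1|10}} &= p_1 [1 - q_1] \ge 0 \\
\frac{\partial \langle \Pi^X \rangle}{\partial p_{1|11}} &= p_1 q_1 \ge 0,
\end{aligned}
\tag{11.12}
\]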



with similar results applying for Y. As the rate of change of the expected payoff is
essentially positive with increasing last stage defection probability, each player maximizes
their expected payoff by defecting with certainty in the last stage. That is, each player
sets $p_{1|x_1 y_1} = 1$ and $q_{1|x_1 y_1} = 1$ on every pathway. Taking account of this last stage result
simplifies the optimization for the first stage probability variables (backwards induction),
giving

\[
\frac{\partial \langle \Pi^X \rangle}{\partial p_1} = 1,
\tag{11.13}
\]

with similar results applying for Y. Again, players will defect in the first stage by
setting $p_1 = 1$ and $q_1 = 1$. Hence, players conclude that, given the adoption of the
joint probability space $\mathcal{P}^X_B \times \mathcal{P}^Y_B$, they maximize their expected payoffs by defecting in
every stage of the game, $(x_1, y_1, x_2, y_2) = (1, 1, 1, 1)$, to derive a joint expected payoff
of $(\langle \Pi^X \rangle, \langle \Pi^Y \rangle) = (N, N) = (2, 2)$. This is the unique Nash equilibrium pathway for
the finite iterated prisoner's dilemma, given the adoption of the joint probability space
$\mathcal{P}^X_B \times \mathcal{P}^Y_B$.
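The backwards induction argument for the unconstrained space can be verified numerically. The sketch below codes the expected payoff of player X from Eq. (11.10) and uses the fact that it is linear in each probability parameter separately, so each partial derivative equals the difference between the payoff at parameter value 1 and at value 0. The function names and the sample values of $p_1$ and $q_1$ are illustrative assumptions.

```python
from fractions import Fraction

HISTORIES = ("00", "01", "10", "11")   # first stage outcomes x1 y1


def expected_payoff_x(p1, q1, p2, q2):
    """<Pi^X> of Eq. (11.10); p2[h] and q2[h] are the second stage defection
    probabilities of X and Y after first stage history h."""
    weights = {
        "00": (1 - p1) * (1 - q1), "01": (1 - p1) * q1,
        "10": p1 * (1 - q1), "11": p1 * q1,
    }
    return 4 + p1 - 2 * q1 + sum(weights[h] * (p2[h] - 2 * q2[h]) for h in HISTORIES)


def slope_in_p2(h, p1, q1, p2, q2):
    """Exact partial derivative with respect to p_{1|h} (payoff is linear in it)."""
    upper = expected_payoff_x(p1, q1, {**p2, h: 1}, q2)
    lower = expected_payoff_x(p1, q1, {**p2, h: 0}, q2)
    return upper - lower


p1, q1 = Fraction(1, 3), Fraction(1, 4)
half = {h: Fraction(1, 2) for h in HISTORIES}
slopes = {h: slope_in_p2(h, p1, q1, half, half) for h in HISTORIES}

# With last stage defection fixed on every history, the first stage slope is 1.
ones = {h: 1 for h in HISTORIES}
first_stage_slope = expected_payoff_x(1, q1, ones, ones) - expected_payoff_x(0, q1, ones, ones)
print(slopes, first_stage_slope, expected_payoff_x(1, 1, ones, ones))
```

The four second stage slopes reproduce the products in Eq. (11.12), the first stage slope reproduces Eq. (11.13), and the all defect pathway returns the payoff of 2.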

11.4.2 Alternate isomorphic probability spaces 

In this section we suppose that players X and Y consider only a choice of four possible 
alternate probability spaces, namely, the conventional independent behavioural strat- 
egy space, a functionally correlated Markovian probability space, a Tit-For-Tat strategy 
space, and an All Defect strategy space. 

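When adopting a Markovian space, each player functionally correlates their second
stage choices to their opponent's first stage choices. That is, player X implements

\[
\begin{aligned}
x_2 &= y_1 \\
p_{x_2|x_1 y_1} &= \delta_{x_2 y_1},
\end{aligned}
\tag{11.14}
\]

while player Y chooses

\[
\begin{aligned}
y_2 &= x_1 \\
q_{y_2|x_1 y_1} &= \delta_{y_2 x_1}.
\end{aligned}
\tag{11.15}
\]

We denote these spaces respectively as $\mathcal{P}^X_B|_{x_2 = y_1}$ and $\mathcal{P}^Y_B|_{y_2 = x_1}$.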

When adopting Tit-For-Tat, each player chooses to cooperate in the first stage and
then functionally correlates their second stage choice to the opponent's first stage choice.
Player X implements Tit-For-Tat via

\[
\begin{aligned}
x_1 &= 0 \\
p_{x_1} &= \delta_{x_1 0} \\
x_2 &= y_1 \\
p_{x_2|x_1 y_1} &= \delta_{x_2 y_1},
\end{aligned}
\tag{11.16}
\]

while player Y will implement

\[
\begin{aligned}
y_1 &= 0 \\
q_{y_1} &= \delta_{y_1 0} \\
y_2 &= x_1 \\
q_{y_2|x_1 y_1} &= \delta_{y_2 x_1}.
\end{aligned}
\tag{11.17}
\]

We denote these probability spaces respectively as $\mathcal{P}^X_B|_{x_1 = 0,\, x_2 = y_1}$ and $\mathcal{P}^Y_B|_{y_1 = 0,\, y_2 = x_1}$.

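Finally, by adopting the All Defect space, each player chooses to defect in every
stage. Player X chooses

\[
\begin{aligned}
x_1 &= 1 \\
p_{x_1} &= \delta_{x_1 1} \\
x_2 &= 1 \\
p_{x_2|x_1 y_1} &= \delta_{x_2 1},
\end{aligned}
\tag{11.18}
\]

and player Y chooses

\[
\begin{aligned}
y_1 &= 1 \\
q_{y_1} &= \delta_{y_1 1} \\
y_2 &= 1 \\
q_{y_2|x_1 y_1} &= \delta_{y_2 1}.
\end{aligned}
\tag{11.19}
\]

We denote these probability spaces respectively as $\mathcal{P}^X_B|_{x_1 = x_2 = 1}$ and $\mathcal{P}^Y_B|_{y_1 = y_2 = 1}$.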

Subsequently, players of unbounded rationality will then sequentially examine the 
alternate isomorphic probability spaces available to the players. Within each possible 
space, they will locate the constrained equilibria optimizing outcomes, and then later 
compare these outcomes in a comparison table. We complete this process now. 

11.4.3 N = 2 stage: Independent versus Markovian strategies 

Figure 11.2: The case where players (X, Y) adopt Independent versus Markovian strategies
in the $\mathcal{P}^X_B \times \mathcal{P}^Y_B|_{y_2 = x_1}$ joint probability space. The second stage choices of player
Y are isomorphically constrained and so are not freely varying parameters and do not
appear in the decision tree.

Suppose that the players examine the case where they adopt Independent versus Markovian
strategies and so jointly adopt the $\mathcal{P}^X_B \times \mathcal{P}^Y_B|_{y_2 = x_1}$ probability space. In this space,
the players seek to optimize Eq. (11.10) subject to the isomorphic constraint $y_2 = x_1$. This
constraint alters the expected payoff optimization problems to be

\[
\begin{aligned}
X : \max_{p_1,\, p_{1|00},\, p_{1|01},\, p_{1|10},\, p_{1|11}} \langle \Pi^X \rangle
 &= 4 + p_1 - 2 q_1 + [1 - p_1][1 - q_1]\, p_{1|00} + [1 - p_1]\, q_1\, p_{1|01} \\
 &\quad + p_1 [1 - q_1]\, [p_{1|10} - 2] + p_1 q_1\, [p_{1|11} - 2] \\
Y : \max_{q_1} \langle \Pi^Y \rangle
 &= 4 - 2 p_1 + q_1 - 2 [1 - p_1][1 - q_1]\, p_{1|00} - 2 [1 - p_1]\, q_1\, p_{1|01} \\
 &\quad + p_1 [1 - q_1]\, [1 - 2 p_{1|10}] + p_1 q_1\, [1 - 2 p_{1|11}].
\end{aligned}
\tag{11.20}
\]



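These expected payoffs are continuous multivariate functions dependent only on the freely
varying parameters $[p_1, q_1, p_{1|00}, p_{1|01}, p_{1|10}, p_{1|11}]$. Consequently, the relevant gradient
operator used by both players to analyze this particular probability space is

\[
\nabla = \left[ \frac{\partial}{\partial p_1}, \frac{\partial}{\partial q_1},
 \frac{\partial}{\partial p_{1|00}}, \frac{\partial}{\partial p_{1|01}},
 \frac{\partial}{\partial p_{1|10}}, \frac{\partial}{\partial p_{1|11}} \right],
\tag{11.21}
\]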
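while the resulting game tree is shown in Fig. 11.2. Optimization then proceeds as usual
via

\[
\begin{aligned}
\frac{\partial \langle \Pi^X \rangle}{\partial p_{1|00}} &= [1 - p_1][1 - q_1] \ge 0 \\
\frac{\partial \langle \Pi^X \rangle}{\partial p_{1|01}} &= [1 - p_1]\, q_1 \ge 0 \\
\frac{\partial \langle \Pi^X \rangle}{\partial p_{1|10}} &= p_1 [1 - q_1] \ge 0 \\
\frac{\partial \langle \Pi^X \rangle}{\partial p_{1|11}} &= p_1 q_1 \ge 0,
\end{aligned}
\tag{11.22}
\]

ensuring that player X defects with certainty in the last stage by setting $p_{1|x_1 y_1} = 1$ on
every pathway. These choices then allow evaluating

\[
\begin{aligned}
X : \max_{p_1} \langle \Pi^X \rangle &= 5 - p_1 - 2 q_1 \\
\frac{\partial \langle \Pi^X \rangle}{\partial p_1} &= -1 < 0,
\end{aligned}
\tag{11.23}
\]

so player X cooperates with certainty in the first stage by setting $p_1 = 0$. In contrast,
the analysis by player Y must simply determine their first stage variable (taking account
of the optimized moves by player X) via

\[
\begin{aligned}
Y : \max_{q_1} \langle \Pi^Y \rangle &= 2 + q_1 \\
\frac{\partial \langle \Pi^Y \rangle}{\partial q_1} &= 1 > 0,
\end{aligned}
\tag{11.24}
\]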



so player Y defects in the first stage by setting $q_1 = 1$. Altogether, when players
(X, Y) adopt the $\mathcal{P}^X_B \times \mathcal{P}^Y_B|_{y_2 = x_1}$ joint probability space, they play the move
combination $(x_1, y_1, x_2, y_2) = (0, 1, 1, 0)$ to garner payoffs $(\langle \Pi^X \rangle, \langle \Pi^Y \rangle) = (3, 3)$. Here, in this
particular joint probability space, the player adopting an Independent strategy must
cooperate in the first stage to ensure that their mimicking opponent playing a Markovian
strategy will cooperate in the second stage, so setting them up for a sucker's payoff in that stage.
However, this gains them little as their opponent can still freely defect in the first stage,
so in the end the players end up with equal payoffs.
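The Independent versus Markovian case can be checked by enumerating pure moves under the constraint $y_2 = x_1$. The sketch below is an illustration only: it tests that last stage defection is always better for X, that the reduced first stage payoffs at pure moves follow $5 - p_1 - 2 q_1$ for X and $2 - p_1 + q_1$ for Y (which is $2 + q_1$ once $p_1 = 0$), and then reads off the resulting moves. The function name `totals` is a hypothetical helper.

```python
from itertools import product


def stage_payoffs(x, y):
    """Single stage payoffs (Pi^X, Pi^Y)."""
    return 2 + x - 2 * y, 2 - 2 * x + y


def totals(x1, y1, x2):
    """Two stage totals when Y is constrained to mimic X via y2 = x1."""
    y2 = x1
    first, second = stage_payoffs(x1, y1), stage_payoffs(x2, y2)
    return first[0] + second[0], first[1] + second[1]


first_stage_moves = list(product((0, 1), repeat=2))

# Last stage: defection (x2 = 1) is better for X after every first stage history.
last_stage_defects = all(totals(x1, y1, 1)[0] > totals(x1, y1, 0)[0]
                         for x1, y1 in first_stage_moves)

# First stage, with x2 = 1 fixed: compare against the reduced linear payoffs.
reduced_ok = all(totals(x1, y1, 1) == (5 - x1 - 2 * y1, 2 - x1 + y1)
                 for x1, y1 in first_stage_moves)

# X's best first move does not depend on y1, so any fixed y1 can be used here.
best_x1 = max((0, 1), key=lambda x1: totals(x1, 1, 1)[0])
best_y1 = max((0, 1), key=lambda y1: totals(best_x1, y1, 1)[1])
outcome = totals(best_x1, best_y1, 1)
print(last_stage_defects, reduced_ok, (best_x1, best_y1), outcome)
```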



11.4.4 N = 2 stage: Independent versus All Defect strategies 

Suppose now that players examine the situation where they jointly adopt Independent
versus All Defect strategies in the $\mathcal{P}^X_B \times \mathcal{P}^Y_B|_{y_1 = y_2 = 1}$ probability space. After resolution of
the adopted isomorphic constraints, the expected payoff optimization problems become

\[
\begin{aligned}
X : \max_{p_1,\, p_{1|01},\, p_{1|11}} \langle \Pi^X \rangle
 &= 2 + p_1 + [1 - p_1]\, [p_{1|01} - 2] + p_1\, [p_{1|11} - 2] \\
Y : \langle \Pi^Y \rangle
 &= 5 - 2 p_1 + [1 - p_1]\, [1 - 2 p_{1|01}] + p_1\, [1 - 2 p_{1|11}].
\end{aligned}
\tag{11.25}
\]
Figure 11.3: The case where players (X, Y) adopt Independent versus All Defect strategies
in the $\mathcal{P}^X_B \times \mathcal{P}^Y_B|_{y_1 = y_2 = 1}$ joint probability space. We write $p_{x_2|x_1 1} \to p_{x_2|x_1}$. Here, neither
first nor second stage choices of player Y appear in the game tree as they have been
isomorphically constrained.



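Given the isomorphic constraints adopted by the players, these expected payoff functions
are dependent solely on the freely varying parameters $[p_1, p_{1|01}, p_{1|11}]$, so the relevant
gradient operator used by both players in their analysis is

\[
\nabla = \left[ \frac{\partial}{\partial p_1}, \frac{\partial}{\partial p_{1|01}}, \frac{\partial}{\partial p_{1|11}} \right].
\tag{11.26}
\]

Consequently, optimization for player X gives

\[
\begin{aligned}
\frac{\partial \langle \Pi^X \rangle}{\partial p_{1|01}} &= [1 - p_1] \ge 0 \\
\frac{\partial \langle \Pi^X \rangle}{\partial p_{1|11}} &= p_1 \ge 0,
\end{aligned}
\tag{11.27}
\]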



leading, essentially, to defection on all second stage histories via $p_{1|x_1 1} = 1$ and $q_{1|x_1 1} = 1$
on every pathway. Taking account of these last stage results then gives

\[
\begin{aligned}
X : \max_{p_1} \langle \Pi^X \rangle &= 1 + p_1 \\
\frac{\partial \langle \Pi^X \rangle}{\partial p_1} &= 1,
\end{aligned}
\tag{11.28}
\]

so player X also defects in the first stage with certainty through the choice $p_1 = 1$.
Altogether, the $\mathcal{P}^X_B \times \mathcal{P}^Y_B|_{y_1 = y_2 = 1}$ joint probability space leads both players to mutual
defection in every stage to garner expected payoffs of $(\langle \Pi^X \rangle, \langle \Pi^Y \rangle) = (N, N) = (2, 2)$.



11.4.5 N = 2 stage: Independent versus Tit-For-Tat strategies 

Figure 11.4: The case where players (X, Y) adopt Independent versus Tit-For-Tat strategies
in the $\mathcal{P}^X_B \times \mathcal{P}^Y_B|_{y_1 = 0,\, y_2 = x_1}$ joint probability space. We write $p_{x_2|x_1 0} = p_{x_2|x_1}$. Again,
neither first nor second stage choices of player Y appear in the game tree as they have
been isomorphically constrained and so are not freely varying parameters.

If, on the other hand, players (X, Y) suppose that together they adopt the
$\mathcal{P}^X_B \times \mathcal{P}^Y_B|_{y_1 = 0,\, y_2 = x_1}$ joint probability space, then the expected payoff function optimization
problem becomes

\[
\begin{aligned}
X : \max_{p_1,\, p_{1|00},\, p_{1|10}} \langle \Pi^X \rangle
 &= 4 + p_1 + [1 - p_1]\, p_{1|00} + p_1\, [p_{1|10} - 2] \\
Y : \langle \Pi^Y \rangle
 &= 4 - 2 p_1 - 2 [1 - p_1]\, p_{1|00} + p_1\, [1 - 2 p_{1|10}].
\end{aligned}
\tag{11.29}
\]



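As such, the expected payoff functions are dependent only on the freely varying parameters
$[p_1, p_{1|00}, p_{1|10}]$, so the relevant gradient operator used by both players in their analysis
is

\[
\nabla = \left[ \frac{\partial}{\partial p_1}, \frac{\partial}{\partial p_{1|00}}, \frac{\partial}{\partial p_{1|10}} \right].
\tag{11.30}
\]

Consequently, optimization for player X gives

\[
\begin{aligned}
\frac{\partial \langle \Pi^X \rangle}{\partial p_{1|00}} &= [1 - p_1] \ge 0 \\
\frac{\partial \langle \Pi^X \rangle}{\partial p_{1|10}} &= p_1 \ge 0,
\end{aligned}
\tag{11.31}
\]

leading, essentially, to defection on all second stage histories via $p_{1|x_1 0} = 1$ on every
pathway. Taking account of these last stage results then gives

\[
\begin{aligned}
X : \max_{p_1} \langle \Pi^X \rangle &= 5 - p_1 \\
\frac{\partial \langle \Pi^X \rangle}{\partial p_1} &= -1,
\end{aligned}
\tag{11.32}
\]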



so player X cooperates in the first stage with certainty through the choice $p_1 = 0$.
Altogether, the $\mathcal{P}^X_B \times \mathcal{P}^Y_B|_{y_1 = 0,\, y_2 = x_1}$ joint probability space leads players to the move
combination $(x_1, y_1, x_2, y_2) = (0, 0, 1, 0)$ to garner payoffs $(\langle \Pi^X \rangle, \langle \Pi^Y \rangle) = (5, 2)$.
Figure 11.5: The case where players (X, Y) adopt Markovian versus Markovian strategies
in the $\mathcal{P}^X_B|_{x_2 = y_1} \times \mathcal{P}^Y_B|_{y_2 = x_1}$ joint probability space. As both players functionally assign
their second stage choices, the only freely varying parameters are the first stage choices
of each player.

11.4.6 N = 2 stage: Markovian versus Markovian strategies

Suppose now that players (X, Y) jointly assume they both adopt the
$\mathcal{P}^X_B|_{x_2 = y_1} \times \mathcal{P}^Y_B|_{y_2 = x_1}$
probability space. After resolution of the adopted isomorphic constraints, the expected
payoff function optimization problems become

\[
\begin{aligned}
X &: \max_{p_1} \langle \Pi^X \rangle = 4 - p_1 - q_1 \\
Y &: \max_{q_1} \langle \Pi^Y \rangle = 4 - p_1 - q_1,
\end{aligned}
\tag{11.33}
\]



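which are dependent only on the freely varying parameters $[p_1, q_1]$, so immediately, the
gradient operator used by each player in their analysis is

\[
\nabla = \left[ \frac{\partial}{\partial p_1}, \frac{\partial}{\partial q_1} \right].
\tag{11.34}
\]

Optimization then proceeds straightforwardly giving respectively for each player

\[
\begin{aligned}
\frac{\partial \langle \Pi^X \rangle}{\partial p_1} &= -1 \\
\frac{\partial \langle \Pi^Y \rangle}{\partial q_1} &= -1,
\end{aligned}
\tag{11.35}
\]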



ensuring that in this space, both players cooperate with certainty in the first stage by
setting $p_1 = q_1 = 0$. Altogether, when players (X, Y) adopt the $\mathcal{P}^X_B|_{x_2 = y_1} \times \mathcal{P}^Y_B|_{y_2 = x_1}$ joint
probability space, they cooperate via the move combination $(x_1, y_1, x_2, y_2) = (0, 0, 0, 0)$
to garner payoffs $(\langle \Pi^X \rangle, \langle \Pi^Y \rangle) = (4, 4)$. That is, under a joint constraint where each
player mimics their opponent's previous moves, a strategy of cooperation is rational as
it maximizes expected payoffs for both players.
Figure 11.6: The case where players (X, Y) adopt Markovian versus All Defect strategies
in the $\mathcal{P}^X_B|_{x_2 = y_1} \times \mathcal{P}^Y_B|_{y_1 = y_2 = 1}$ joint probability space. As both players functionally assign
all of their second stage choices while player Y defects with certainty in the first stage,
the only freely varying parameter left is the first stage choice of player X, reducing the
game to being a single-player-single-stage situation as shown.

11.4.7 N = 2 stage: Markovian versus All Defect strategies 

Suppose now that players (X, Y) analyze the case where they jointly adopt the
$\mathcal{P}^X_B|_{x_2 = y_1} \times \mathcal{P}^Y_B|_{y_1 = y_2 = 1}$ probability space. The resolution of the adopted constraints means that the
expected payoff function optimization problem for the players becomes

\[
\begin{aligned}
X : \max_{p_1} \langle \Pi^X \rangle &= 1 + p_1 \\
\langle \Pi^Y \rangle &= 4 - 2 p_1,
\end{aligned}
\tag{11.36}
\]

which are dependent only on the freely varying parameter $p_1$, so immediately, the gradient
operator used by each player in their analysis is

\[
\nabla = \frac{\partial}{\partial p_1}.
\tag{11.37}
\]

Player X then evaluates

\[
\frac{\partial \langle \Pi^X \rangle}{\partial p_1} = 1,
\tag{11.38}
\]

ensuring that this player defects with certainty in the first stage by setting $p_1 = 1$.
Altogether, when players (X, Y) jointly adopt the $\mathcal{P}^X_B|_{x_2 = y_1} \times \mathcal{P}^Y_B|_{y_1 = y_2 = 1}$ probability
space, they generate the optimal move combination $(x_1, y_1, x_2, y_2) = (1, 1, 1, 1)$ to garner
payoffs $(\langle \Pi^X \rangle, \langle \Pi^Y \rangle) = (2, 2)$.

11.4.8 N = 2 stage: Markovian versus Tit-For-Tat strategies



Figure 11.7: The case where players (X, Y) adopt Markovian versus Tit-For-Tat strategies
in the $\mathcal{P}^X_B|_{x_2 = y_1} \times \mathcal{P}^Y_B|_{y_1 = 0,\, y_2 = x_1}$ joint probability space. As both players functionally assign
all of their second stage choices while player Y cooperates with certainty in the first stage,
the only freely varying parameter left is the first stage choice of player X, reducing the
game to being a single-player-single-stage situation as shown.

Suppose now that players (X, Y) jointly assume that together they adopt the
$\mathcal{P}^X_B|_{x_2 = y_1} \times \mathcal{P}^Y_B|_{y_1 = 0,\, y_2 = x_1}$ probability space. After resolution of the isomorphic constraints, the
expected payoff function optimization problems become

\[
\begin{aligned}
X : \max_{p_1} \langle \Pi^X \rangle &= 4 - p_1 \\
\langle \Pi^Y \rangle &= 4 - p_1,
\end{aligned}
\tag{11.39}
\]

which are dependent only on the freely varying parameter $p_1$, so immediately, the gradient
operator used by each player in their analysis is

\[
\nabla = \frac{\partial}{\partial p_1}.
\tag{11.40}
\]

Player X then evaluates

\[
\frac{\partial \langle \Pi^X \rangle}{\partial p_1} = -1,
\tag{11.41}
\]

ensuring that this player cooperates with certainty in the first stage by setting $p_1 = 0$.
Altogether, when players (X, Y) jointly adopt the $\mathcal{P}^X_B|_{x_2 = y_1} \times \mathcal{P}^Y_B|_{y_1 = 0,\, y_2 = x_1}$ probability
space, they generate the optimal move combination $(x_1, y_1, x_2, y_2) = (0, 0, 0, 0)$ to garner
payoffs $(\langle \Pi^X \rangle, \langle \Pi^Y \rangle) = (4, 4)$.



11.4.9 N = 2 stage: Comparing payoffs 

The remainder of the possible probability spaces that the players might analyze,
Tit-For-Tat versus Tit-For-Tat ($\mathcal{P}^X_B|_{x_1 = 0,\, x_2 = y_1} \times \mathcal{P}^Y_B|_{y_1 = 0,\, y_2 = x_1}$), Tit-For-Tat versus All Defect
($\mathcal{P}^X_B|_{x_1 = 0,\, x_2 = y_1} \times \mathcal{P}^Y_B|_{y_1 = y_2 = 1}$), and All Defect versus All Defect ($\mathcal{P}^X_B|_{x_1 = x_2 = 1} \times \mathcal{P}^Y_B|_{y_1 = y_2 = 1}$),
possess no free variables whatsoever and so merely involve an evaluation of the expected
payoffs in each case. Altogether, under the assumption that either player might adopt
any of the four probability spaces considered here, players must compare 16 possible
isomorphically constrained optima to locate their optimal choice of probability space.
The comparison table showing every possible combination of adopted probability space
for either player is

\[
\begin{array}{l|cccc}
(\langle \Pi^X \rangle, \langle \Pi^Y \rangle)
 & \mathcal{P}^Y_B|_{y_2 = x_1}
 & \mathcal{P}^Y_B
 & \mathcal{P}^Y_B|_{y_1 = 0,\, y_2 = x_1}
 & \mathcal{P}^Y_B|_{y_1 = y_2 = 1} \\ \hline
\mathcal{P}^X_B|_{x_2 = y_1}            & (4,4) & (3,3) & (4,4) & (2,2) \\
\mathcal{P}^X_B                         & (3,3) & (2,2) & (5,2) & (2,2) \\
\mathcal{P}^X_B|_{x_1 = 0,\, x_2 = y_1} & (4,4) & (2,5) & (4,4) & (1,4) \\
\mathcal{P}^X_B|_{x_1 = x_2 = 1}        & (2,2) & (2,2) & (4,1) & (2,2).
\end{array}
\tag{11.42}
\]



This table of alternate expected payoffs makes it evident that the Tit-For-Tat and All
Defect probability spaces are weakly dominated by the Markovian and Independent
probability spaces. Players' choices of optimal probability spaces then come down effectively
to a comparison of the Markovian and the Independent probability spaces. Perusal of the
table shows that adopting the Markovian probability space offers the better returns to
either player.

Given this admittedly small set of possible strategy constraints, rational players will
maximize their expected payoffs by adopting a Markovian strategy and rationally
cooperate in the finite iterated prisoner's dilemma. The traditional result of conventional
game analysis that mutual all defection is the unique Nash equilibrium for this game is an
incomplete analysis based on the unjustified restriction that players can only employ a
restricted set of independent probability distributions.
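The comparison table of Eq. (11.42) can be reproduced by brute force. The sketch below makes two simplifying assumptions that follow the preceding subsections but are stated here explicitly: any free second stage move is set to defection (the last stage result found in every case above), and only pure first stage moves are searched. The dictionary `SPACES` and the helper names are illustrative.

```python
from itertools import product


def stage_payoffs(x, y):
    """Single stage payoffs (Pi^X, Pi^Y) of Eq. (11.2)."""
    return 2 + x - 2 * y, 2 - 2 * x + y


# Each space: (fixed first move or None if free, second stage rule).
# "copy" mimics the opponent's first move; "defect" plays 1, which is also
# what backward induction gives for a free second stage move (assumption above).
SPACES = {
    "Markovian": (None, "copy"),
    "Independent": (None, "defect"),
    "Tit-For-Tat": (0, "copy"),
    "All Defect": (1, "defect"),
}


def totals(space_x, space_y, x1, y1):
    """Two stage total payoffs for given first moves in a joint space."""
    x2 = y1 if SPACES[space_x][1] == "copy" else 1
    y2 = x1 if SPACES[space_y][1] == "copy" else 1
    first, second = stage_payoffs(x1, y1), stage_payoffs(x2, y2)
    return first[0] + second[0], first[1] + second[1]


def free_moves(space):
    fixed = SPACES[space][0]
    return (0, 1) if fixed is None else (fixed,)


def constrained_equilibria(space_x, space_y):
    """Payoffs at pure first stage equilibria over the free first stage moves."""
    xs, ys = free_moves(space_x), free_moves(space_y)
    found = []
    for x1, y1 in product(xs, ys):
        pay_x, pay_y = totals(space_x, space_y, x1, y1)
        if (all(totals(space_x, space_y, a, y1)[0] <= pay_x for a in xs)
                and all(totals(space_x, space_y, x1, b)[1] <= pay_y for b in ys)):
            found.append((pay_x, pay_y))
    return found


table = {(sx, sy): constrained_equilibria(sx, sy) for sx in SPACES for sy in SPACES}
for sx in SPACES:
    print(f"{sx:12s}", [table[(sx, sy)] for sy in SPACES])
```

Each of the 16 joint spaces yields a single constrained equilibrium under these assumptions, with rows and columns ordered as Markovian, Independent, Tit-For-Tat, All Defect to match the table.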



11.4.10 N = 2 stage: Extended isomorphic constraints 

An immediate question of interest is whether the conclusion that cooperation is rational 
survives an extended analysis employing a wider class of possible isomorphic constraints 
which we investigate now. We here examine a total of 256 alternate probability spaces for 
the N = 2 stage iterated prisoner's dilemma game. The resulting game trees are shown 
in Fig. 11.8 (appearing in exploded form), with optimized expected payoffs derived in
each joint probability space shown in Table 11.1.

We suppose that each of our players, denoted $Z \in \{X, Y\}$, chooses whether each
of their four second stage behavioural strategies $P^Z(z_2|x_1 y_1)$ is either independent,
denoted "0", or perfectly correlated to their opponent's previous move, denoted "+".
(Perfect anti-correlations are also possible, but these are not considered here.) There are
four histories $(x_1, y_1) \in \{(0, 0), (0, 1), (1, 0), (1, 1)\}$. Admittedly, it is unusual to specify
whether a behavioural strategy implemented at a single node of a game tree is either
independent of previous events or correlated with previous events. However, there is
nothing preventing this from occurring: it might not be an optimal choice, but it is a
possible set of choices that a player might make when optimizing their payoffs over a
game tree.

Figure 11.8: The generated trees and the equilibrium pathways (indicated by small dots,
with multiple dots indicating mixed equilibrium pathways) assuming that player X adopts
the probability space shown on the vertical axis and that player Y adopts the probability
space shown horizontally. (When the $x_2$ choice is correlated and the $y_2$ choice is
independent, a vertical line is shown to maintain the relative spacings of each tree.) The expected
payoffs under each strategy combination are shown in Table 11.1.



Consequently, if player Z chooses to make all of the second stage behavioural strategy
probability distributions $P^Z(z_2|x_1 y_1)$ independent then the adopted space is $\mathcal{P}^Z_{0000}$. This
means that the randomized choices player Z makes at every second stage node of the
game tree are independent of every other event (as is usually the case). However, if Z
chooses to functionally correlate all of their second stage behavioural strategy probability
distributions $P^Z(z_2|x_1 y_1)$ then the adopted space is $\mathcal{P}^Z_{++++}$. In this case, the dice roll that
Z uses to make their choice of $y_2$ (for $Z = Y$) in the case $(x_1, y_1) = (0, 0)$ will be perfectly correlated
to the previous event $x_1 = 0$. As noted, this is an unusual choice but nevertheless
it is still a possible choice. Intermediate cases include when, for instance, Z decides
to make $P^Z(z_2|00)$ and $P^Z(z_2|10)$ independent, and to functionally correlate $P^Z(z_2|01)$
and $P^Z(z_2|11)$, in which case the adopted space is $\mathcal{P}^Z_{0+0+}$, and so on. Altogether, there
are 16 possible choices that player Z might make about their probability space, namely
$\{\mathcal{P}^Z_{0000}, \mathcal{P}^Z_{000+}, \mathcal{P}^Z_{00+0}, \mathcal{P}^Z_{00++}, \dots, \mathcal{P}^Z_{++++}\}$. In combination, both players can jointly adopt
one of $16^2 = 256$ different joint probability spaces, in each of which a potentially different
constrained equilibrium exists, and all of these optima must be compared so that players
can decide which probability space they can rationally choose.
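The counting argument can be made concrete with a short enumeration. In the sketch below each space is labelled by one symbol per first stage history, taken in the order 00, 01, 10, 11 used above, with "0" for an independent second stage choice and "+" for a correlated one. The variable names are illustrative.

```python
from itertools import product

HISTORIES = ("00", "01", "10", "11")   # first stage histories x1 y1

# One symbol per history: "0" = independent, "+" = correlated to the opponent.
labels = ["".join(symbols) for symbols in product("0+", repeat=len(HISTORIES))]

# Every pairing of a space for X with a space for Y.
joint_spaces = list(product(labels, repeat=2))

# Histories at which the intermediate example space "0+0+" is correlated.
correlated_histories = [h for h, s in zip(HISTORIES, "0+0+") if s == "+"]

print(labels[:4], labels[-1])
print(len(labels), len(joint_spaces), correlated_histories)
```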



Here, without presenting the details of the calculations, we show the results of
comparing all 16 possible probability spaces of each player against all 16 of their opponent's
possible probability spaces; see Fig. 11.8 and Table 11.1. (In cases where players are
indifferent to move choice, we arbitrarily choose cooperation.) We also note that it turns
out that there is only one isomorphically constrained equilibrium in each probability space
and some of these are in mixed strategies.



It is of course possible to use Table 11.1 to locate globally optimal choices of probability
space. Examination of this table shows that many rows and columns are identical.
Numbering each row from top to bottom by $r_i$ and each column from left to right by $c_j$
($1 \le i, j \le 16$), we have $r_1 = r_2 = r_5 = r_6$, $r_3 = r_4 = r_7 = r_8$, $r_9 = r_{10} = r_{13} = r_{14}$,
and $r_{11} = r_{12} = r_{15} = r_{16}$. As well, we have $c_1 = c_2 = c_3 = c_4$, $c_5 = c_6 = c_7 = c_8$,
$c_9 = c_{10} = c_{11} = c_{12}$, and $c_{13} = c_{14} = c_{15} = c_{16}$. Removing all identical rows and columns
leaves the variational payoff table
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An inspection by eye (checked by numerical calculation) confirms that the only "equi- 
libria" in this reduced table of constrained equilibria are the uninteresting combina- 
tions in the bottom right of (Ptooo>^o^+++)> (^o+++>^oooo)) and (p^++,Po^++), and 
the more interesting payoff maximizing equilibria in the top left of (^++++,^++++), 

(^++++,^+000) > (^+ooo>^++++)' and (Pfooo^^+ooo)- In these latter equilibria, as long 
as players functionally correlate their behavioural strategies in the second stage follow- 
ing from the history {xi,yi) = (0,0), then they will conclude that it is payoff maxi- 
mizing to cooperate in this finite iterated prisoner's dilemma to garner joint payoffs of 
( (n^), (n^)j = (4,4). Any other choice is not rational. 
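
The two steps just described, removing identical rows and columns and then checking each remaining cell against the Nash condition, might be sketched as follows. The payoff pairs in `toy` are hypothetical placeholders and are not the entries of Table 11.1; the helper names are likewise chosen here for illustration.

```python
def dedupe(table):
    """Drop repeated rows, then repeated columns, keeping first occurrences."""
    rows = []
    for row in table:
        if row not in rows:
            rows.append(row)
    cols = []
    for col in zip(*rows):
        if col not in cols:
            cols.append(col)
    return [list(r) for r in zip(*cols)]

def pure_equilibria(table):
    """Cells where neither the row player (first payoff) nor the column
    player (second payoff) gains by changing their own choice alone."""
    found = []
    for i, row in enumerate(table):
        for j, (px, py) in enumerate(row):
            best_row = all(px >= table[k][j][0] for k in range(len(table)))
            best_col = all(py >= row[m][1] for m in range(len(row)))
            if best_row and best_col:
                found.append((i, j))
    return found

# Hypothetical payoff pairs (row player, column player); illustration only.
toy = [
    [(4, 4), (4, 4), (1, 3)],
    [(4, 4), (4, 4), (1, 3)],
    [(3, 1), (3, 1), (2, 2)],
]
reduced = dedupe(toy)
print(reduced)                   # [[(4, 4), (1, 3)], [(3, 1), (2, 2)]]
print(pure_equilibria(reduced))  # [(0, 0), (1, 1)]
```

The weak inequalities mean that cells where a player is indifferent still count as equilibria, which matches the need to break ties by a separate rule.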

Again, we conclude that players of unrestricted rationality will cooperate in the finite iterated prisoner's dilemma. As such, our analysis reconciles game theoretic predictions and the cooperative human behaviours observed in experimental tests [96, 97].



11.5 N > 2 stages: A limited investigation 



We now consider the case where the number of stages is known and finite and greater than two. We will consider how players might vary their choice of probability space or of isomorphic constraints so as to optimize the expected payoffs of Eq. 11.4 when the number of stages $N > 2$. Our analysis will be limited, as with each additional stage the number of possible joint probability spaces that might be considered by the players increases exponentially. In the present section, we suppose that players adopt either a conventional independent behavioural space or a Markovian space in which current stage choices are correlated to the immediately preceding stage choices. In more detail, the choices open to the players include adopting either a conventional independent behavioural strategy space $\mathcal{P}^X_B$ and $\mathcal{P}^Y_B$, or a Markovian probability space $\mathcal{P}^X_B|_{x_n = y_{n-1}}$ and $\mathcal{P}^Y_B|_{y_n = x_{n-1}}$. In subsequent sections, we will examine the various combinations of probability space that might be adopted, and we will finally allow players to try preemptive defection near the terminal stages of the game. This will allow us to check whether these defections propagate backwards as required by a standard backwards induction analysis.

11.5.1 N > 2: Independent strategies

We first presume that players X and Y each examine the case where they jointly adopt the space $\mathcal{P}^X_B \times \mathcal{P}^Y_B$ in which all of their behavioural strategies on every possible history set are independent of any other event. The players seek to optimize their respective expected payoff functions in Eq. 11.4.

Every behavioural probability parameter (after normalization) is independent, so the gradient operator used by both players to analyze optimal play is

\[
\left[ \frac{d}{dP^X(1)}, \frac{d}{dP^Y(1)}, \frac{d}{dP^X(1|H_1)}, \frac{d}{dP^Y(1|H_1)}, \dots, \frac{d}{dP^X(1|H_N)}, \frac{d}{dP^Y(1|H_N)} \right] \tag{11.44}
\]

where gradients are taken with respect to all possible history sets $H_n$. Also, gradients are taken via total derivatives rather than partial derivatives to facilitate calculations: the normalization constraint $P^X(0|H_n) = 1 - P^X(1|H_n)$ allows writing the total rate of change of the expected payoff function with respect to the changing probability parameters as

\[
\frac{d}{dP^X(1|H_n)} = \frac{\partial}{\partial P^X(1|H_n)} - \frac{\partial}{\partial P^X(0|H_n)}. \tag{11.45}
\]

Each player can then straightforwardly use this gradient operator defined within the joint probability space $\mathcal{P}^X_B \times \mathcal{P}^Y_B$ to evaluate their optimal choices. In particular, the shorthand notation $H_n = \{H_{n-1}, x_n, y_n\}$ and some algebra allows writing the optimization conditions for player X as the set of simultaneous equations



dp^{l) 
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1 + 
1 
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1 1 

Y, P^{yN~i\HN-i) Y {xN-^yN)^ 

yiv_i=0 XMyN=0 

:>X/^ ITT 1 „. \r>Y 



dP^niHN] 



[P {xN\HN_i,l,yN_i)P {yN\HN^i,l,yN_i)- 
P^'ixNlHM-i, 0, yN-i)P''iyN\HN-i, 0, yN-i) ] 
1. (11.46) 
The equivalent simultaneous optimization conditions for player Y are

dP^{i) ~ ••• 
d{W) 

dP^{l\HN-l) 



1 + 

J2 P^(a:i)P^(l/i) . ..P''{xN^2\HN-2)P'^{yN-2\HN-2) X 

x-i...xj^_2=0 
J/l---S/JV-2=0 

1 1 

^ P^{xn-i\Hn-i) Y. (l/iv-2x7v)x 

[ P^{xn\Hn^i, xn-1, l)P^{yN\HN-i, xn-i, 1) - 

P^{xN\HN-l,XN-l,0)P^{yN\HN-l,XN-l,0) ] 



d{W) 



dP^{l\H^ 



;ii.47) 



N 



Subsequently each player solves their respective sets of simultaneous equations to maximize their expected payoff in the joint probability space $\mathcal{P}^X_B \times \mathcal{P}^Y_B$ by setting $P^X(1|H_N) = 1$ and $P^Y(1|H_N) = 1$ for all history sets $H_N$, and by setting $P^X(1|H_{N-1}) = 1$ and $P^Y(1|H_{N-1}) = 1$ for all history sets $H_{N-1}$, and so on. The final result is that both players defect at every stage, giving optimal choices as $(x_n,y_n) = (1,1) = (D,D)$ for all $n$. At this point, payoffs are $(\langle\Pi^X\rangle, \langle\Pi^Y\rangle) = (N,N)$.
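
A small numerical check of this conclusion is sketched below under two loudly stated assumptions. First, the stage payoffs are taken to be $\Pi^X = 2 + x - 2y$ and $\Pi^Y = 2 - 2x + y$ with 0 for cooperation and 1 for defection; this matrix is not stated in this section and is assumed here because it reproduces the totals $(N,N)$, $(2N-1,2N-1)$ and $(2N,2N)$ quoted in the surrounding text. Second, the opponent's moves are treated as a fixed sequence rather than as history-contingent behavioural strategies.

```python
from itertools import product

def stage_payoffs(x, y):
    """Assumed stage payoffs (not given in this section); 0 = C, 1 = D."""
    return 2 + x - 2 * y, 2 - 2 * x + y

def total_payoffs(xs, ys):
    """Sum the stage payoffs of both players over a full play sequence."""
    px = sum(stage_payoffs(x, y)[0] for x, y in zip(xs, ys))
    py = sum(stage_payoffs(x, y)[1] for x, y in zip(xs, ys))
    return px, py

def best_reply_x(ys):
    """X's payoff-maximizing move sequence against a fixed sequence ys."""
    n = len(ys)
    return max(product((0, 1), repeat=n),
               key=lambda xs: total_payoffs(xs, ys)[0])

N = 4
all_defect = (1,) * N
# Against every fixed opponent sequence, defecting throughout is X's best reply.
print(all(best_reply_x(ys) == all_defect for ys in product((0, 1), repeat=N)))
print(total_payoffs(all_defect, all_defect))  # (4, 4), that is (N, N)
```

Under these assumptions each of X's moves adds to X's own total regardless of what Y does, which is why independent play collapses to defection at every stage.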

11.5.2 N > 2: Markovian versus Independent spaces

Suppose now that players X and Y jointly examine the case where Y adopts the independent probability space while X adopts isomorphic constraints to implement Markovian play. In this case the joint probability space is $\mathcal{P}^X_B|_{x_n = y_{n-1}} \times \mathcal{P}^Y_B$. Here, X adopts the isomorphic constraints

\[
P^X(x_n|H_n) = \delta_{x_n y_{n-1}} \tag{11.48}
\]

for $2 \le n \le N$ and on every history $H_n$. As usual, these isomorphic constraints must be resolved before the optimization can proceed, rendering the optimization problem for each player as



X : max (H^) = 2A^ + 
p^(i) 



E ^ (^i)^i 
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N-l 1 
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E E P^(:ri)P^(yi)...P^(|/„|if:K + 

n=l xiyi...yn=0 
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-2 Y. P''{xi)P''{yi)...P''{yN\H'^)yN, 

3:iyi---yN=0 
1 



Y : max (H^) 

PY(l),pY(l\Hr^) 



2N -2 



E p''ixi)xi 



a;i=0 



+ 



N-1 



(11.49) 



n=l a;i?/i...j/„=0 

+ E P''{xi)P''iyi)...P''{yN\K)y^. 

xiyi---yN=0 

Here, a modified history set appears due to the delta-function constraints so that, for instance, $H'_2 = \{x_1,y_1,x_2,y_2\} = \{x_1,y_1,y_1,y_2\}$. Hereinafter, primes are dropped.

The shorthand notation $H_n = \{H_{n-1}, y_n\}$ for $n \ge 2$ and some algebra allows writing the optimization conditions for player Y as the set of simultaneous equations

rf(H^) 
rfP^(l) ~ ■■■' 



d{W) 



dP^{l\HM-l] 



d{W) 



-1 + 



+ E P"" {xi)P'' {yi) . . . P^(y^-2|^^-2) X 
xiyi---yN-2=o 
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1. 



;ii.5o) 



dP{l\HN) 

Hence, player Y optimizes their payoff by setting $P^Y(1|H_N) = 1$ for every history set $H_N$, and by setting $P^Y(1|H_{N-1}) = 0$ for every history set $H_{N-1}$, and eventually by setting $P^Y(1|H_n) = 0$ for $1 \le n \le (N-1)$. That is, Y maximizes their expected payoff by cooperating in every stage but the last.

Player X is well able to calculate the same optimal choices for their opponent, and uses this knowledge to simplify their own optimization problem to eventually give the condition

\[
\frac{d\langle\Pi^X\rangle}{dP^X(1)} = 1. \tag{11.51}
\]

Consequently, X optimizes their expected payoff by setting $P^X(1) = 1$ and so defects in this first stage.

In the joint probability space $\mathcal{P}^X_B|_{x_n = y_{n-1}} \times \mathcal{P}^Y_B$, the players locate the constrained equilibrium at the point $(x_1, y_1, \dots, y_N) = (1, 0, \dots, 0, 1)$ generating the play sequence

\[
(x_n,y_n) = (1,0),(0,0),\dots,(0,0),(0,1) = (D,C)(C,C)\dots(C,C)(C,D), \tag{11.52}
\]

to give expected payoffs $(\langle\Pi^X\rangle,\langle\Pi^Y\rangle) = (2N-1, 2N-1)$. Here, X defects in the first stage as their opponent cannot respond without decreasing their payoff, while Y can defect in the last stage when X can no longer respond.
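
The constrained equilibrium of this subsection can be checked by brute force for small $N$. The sketch below assumes the stage payoffs $\Pi^X = 2 + x - 2y$ and $\Pi^Y = 2 - 2x + y$ (0 = cooperate, 1 = defect), which are not stated in this section but reproduce the quoted totals, and it simplifies Y's independent behavioural strategies to fixed move sequences. Function names are illustrative.

```python
from itertools import product

def payoffs(x1, ys):
    """X plays x1 and afterwards copies Y's previous move (x_n = y_{n-1});
    Y plays the fixed sequence ys. Stage payoffs are an assumption."""
    xs = (x1,) + tuple(ys[:-1])
    px = sum(2 + x - 2 * y for x, y in zip(xs, ys))
    py = sum(2 - 2 * x + y for x, y in zip(xs, ys))
    return px, py

def equilibria(n):
    """Profiles (x1, ys) from which neither player gains by deviating alone."""
    found = []
    for x1 in (0, 1):
        for ys in product((0, 1), repeat=n):
            px, py = payoffs(x1, ys)
            x_ok = all(payoffs(a, ys)[0] <= px for a in (0, 1))
            y_ok = all(payoffs(x1, b)[1] <= py
                       for b in product((0, 1), repeat=n))
            if x_ok and y_ok:
                found.append((x1, ys, (px, py)))
    return found

print(equilibria(5))  # [(1, (0, 0, 0, 0, 1), (9, 9))]
```

For $N = 5$ the only surviving profile is X defecting first and Y defecting last, with payoffs $(2N-1, 2N-1) = (9, 9)$.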

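11.5.3 N > 2: Markovian versus Markovian strategies

Each player might well then analyze the case where both players adopt Markovian strategies and thereby implement the joint probability space $\mathcal{P}^X_B|_{x_n = y_{n-1}} \times \mathcal{P}^Y_B|_{y_n = x_{n-1}}$. Here, X adopts the isomorphic constraints

\[
P^X(x_n|H_n) = \delta_{x_n y_{n-1}} \tag{11.53}
\]

for $2 \le n \le N$ and every history set $H_n$, while Y adopts the isomorphic constraints

\[
P^Y(y_n|H_n) = \delta_{y_n x_{n-1}} \tag{11.54}
\]

for $2 \le n \le N$ and every history set $H_n$. These constraints must be resolved before the optimization can proceed, reducing the optimization problem for each player to

\[
\begin{aligned}
X &: \max \langle\Pi^X\rangle = \sum_{x_1 y_1 = 0}^{1} P^X(x_1) P^Y(y_1)\, \Pi^X(x_1,y_1), \\
Y &: \max \langle\Pi^Y\rangle = \sum_{x_1 y_1 = 0}^{1} P^X(x_1) P^Y(y_1)\, \Pi^Y(x_1,y_1),
\end{aligned} \tag{11.55}
\]

where the payoffs for a given play sequence $(x_1,y_1)$ are

\[
\begin{aligned}
\Pi^X(x_1,y_1) &=
\begin{cases}
2N - \frac{N}{2} x_1 - \frac{N}{2} y_1, & N \text{ even}, \\
2N - \frac{N-3}{2} x_1 - \frac{N+3}{2} y_1, & N \text{ odd},
\end{cases} \\
\Pi^Y(x_1,y_1) &=
\begin{cases}
2N - \frac{N}{2} x_1 - \frac{N}{2} y_1, & N \text{ even}, \\
2N - \frac{N+3}{2} x_1 - \frac{N-3}{2} y_1, & N \text{ odd}.
\end{cases}
\end{aligned} \tag{11.56}
\]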

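The adoption of the joint probability space $\mathcal{P}^X_B|_{x_n = y_{n-1}} \times \mathcal{P}^Y_B|_{y_n = x_{n-1}}$ has effectively reduced the $N$ stage supergame to a single stage game with variables $x_1$ and $y_1$ and payoff matrix, obtained from Eq. 11.56 for even $N$, of

\[
(\Pi^X, \Pi^Y) =
\begin{array}{c|cc}
X \backslash Y & C & D \\ \hline
C & (2N, 2N) & \tfrac{3}{2}(N, N) \\
D & \tfrac{3}{2}(N, N) & (N, N)
\end{array}
\tag{11.57}
\]

and for odd $N$ of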



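\[
(\Pi^X, \Pi^Y) =
\begin{array}{c|cc}
X \backslash Y & C & D \\ \hline
C & (2N, 2N) & \tfrac{3}{2}(N-1, N+1) \\
D & \tfrac{3}{2}(N+1, N-1) & (N, N)
\end{array}
\tag{11.58}
\]

That is, in the joint probability space $\mathcal{P}^X_B|_{x_n = y_{n-1}} \times \mathcal{P}^Y_B|_{y_n = x_{n-1}}$, the normal form game (and equivalent game tree) is described by an effective payoff matrix with altered off-diagonal elements which naturally modify equilibria.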
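
The reduction to an effective single stage game can be reproduced by direct simulation. The sketch below assumes stage payoffs $\Pi^X = 2 + x - 2y$ and $\Pi^Y = 2 - 2x + y$ (0 = cooperate, 1 = defect); this matrix is an assumption, chosen because it agrees with the totals quoted in this chapter section, and the function names are illustrative.

```python
def markov_play(n, x1, y1):
    """Play n stages when both players copy the opponent's previous move."""
    xs, ys = [x1], [y1]
    for _ in range(n - 1):
        x_prev, y_prev = xs[-1], ys[-1]
        xs.append(y_prev)  # x_n = y_{n-1}
        ys.append(x_prev)  # y_n = x_{n-1}
    return xs, ys

def totals(xs, ys):
    """Total payoffs under the assumed stage payoffs."""
    px = sum(2 + x - 2 * y for x, y in zip(xs, ys))
    py = sum(2 - 2 * x + y for x, y in zip(xs, ys))
    return px, py

def effective_matrix(n):
    """Payoff pair for each first-stage move pair (x1, y1)."""
    return {(x1, y1): totals(*markov_play(n, x1, y1))
            for x1 in (0, 1) for y1 in (0, 1)}

for cell, pair in effective_matrix(5).items():
    print(cell, pair)
# (0, 0) (10, 10)
# (0, 1) (6, 9)
# (1, 0) (9, 6)
# (1, 1) (5, 5)
```

For $N = 5$ the off-diagonal entries are $\tfrac{3}{2}(N-1, N+1) = (6, 9)$ and its mirror image, so a first-stage defection no longer pays more than mutual cooperation.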

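As usual, the constrained equilibria in the joint space $\mathcal{P}^X_B|_{x_n = y_{n-1}} \times \mathcal{P}^Y_B|_{y_n = x_{n-1}}$ are now located via

\[
\frac{d\langle\Pi^X\rangle}{dP^X(1)} =
\begin{cases}
-\frac{N}{2}, & N \text{ even}, \\
-\frac{1}{2}(N-3), & N \text{ odd},
\end{cases}
\qquad
\frac{d\langle\Pi^Y\rangle}{dP^Y(1)} =
\begin{cases}
-\frac{N}{2}, & N \text{ even}, \\
-\frac{1}{2}(N-3), & N \text{ odd}.
\end{cases}
\tag{11.59}
\]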



Thus, for either $N$ even or for $N$ odd and greater than 3 we have the equilibrium points $P^X(1) = 0$ and $P^Y(1) = 0$, or $(x_1,y_1) = (0,0) = (C,C)$. Alternatively, for $N = 1$ the equilibrium is $P^X(1) = 1$ and $P^Y(1) = 1$, or $(x_1,y_1) = (1,1) = (D,D)$. When $N = 3$ these conditions are satisfied for any values of $(x_1,y_1)$, requiring examination of actual payoffs and motivating the selection $(x_1,y_1) = (0,0) = (C,C)$. The generated sequences of play are



\[
(x_n,y_n) = (0,0),(0,0),\dots,(0,0) = (C,C)(C,C)\dots(C,C), \tag{11.60}
\]

to give expected payoffs $(\langle\Pi^X\rangle,\langle\Pi^Y\rangle) = (2N,2N)$.



11.5.4 N > 2: Comparing payoffs 

Each player must then compare the payoffs they expect given that together they jointly adopt the probability space combinations examined above. A table of all possible outcomes for an $N > 2$ stage game given the probability spaces under consideration takes the form

(11.61)

This table makes it clear that in all the games considered here with two or more stages, players of unbounded rationality maximize their payoffs by jointly adopting the probability space $\mathcal{P}^X_B|_{x_n = y_{n-1}} \times \mathcal{P}^Y_B|_{y_n = x_{n-1}}$ in which they adopt isomorphic constraints to correlate all of their choices in every stage after the first with their opponent's. Once each player has adopted this particular probability space, they have adopted a "roulette" randomization device which allows them no further choices in any stage after the first, and they have done this as it maximizes their expected payoff.

As in the $N = 2$ stage game, we conclude that while players of bounded rationality implementing a conventional analysis will defect in the multiple stage game, players of unrestricted rationality will cooperate in the finite iterated prisoner's dilemma. Again, our analysis is consistent with observed human behaviours [96, 97].
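
The comparison of probability spaces can be illustrated by treating each space as a strategy and each constrained equilibrium as a payoff pair. The sketch below uses the payoffs quoted in Sections 11.5.1 to 11.5.3; the entry where X is independent and Y is Markovian is not computed in the text and is assumed here by symmetry. The labels and function names are illustrative.

```python
def outcome_table(n):
    """Constrained-equilibrium payoffs (X, Y) for each choice of space."""
    return {
        ("independent", "independent"): (n, n),
        ("markov", "independent"): (2 * n - 1, 2 * n - 1),
        ("independent", "markov"): (2 * n - 1, 2 * n - 1),  # assumed by symmetry
        ("markov", "markov"): (2 * n, 2 * n),
    }

def nash_cells(table):
    """Space pairs from which neither player gains by switching space alone."""
    labels = ("independent", "markov")
    cells = []
    for sx in labels:
        for sy in labels:
            px, py = table[(sx, sy)]
            x_ok = all(table[(a, sy)][0] <= px for a in labels)
            y_ok = all(table[(sx, b)][1] <= py for b in labels)
            if x_ok and y_ok:
                cells.append((sx, sy))
    return cells

print(nash_cells(outcome_table(5)))  # [('markov', 'markov')]
```

Within this restricted menu of spaces, jointly Markovian play is the only choice that survives the unilateral-deviation test for $N > 2$.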

11.5.5 N > 2: Endgame analysis 

The simplified analysis of the previous section does not allow consideration of "endgame" strategies where players seek to defect in the final stages of a multiple stage game to preempt the defection of their opponent. It is these preemptive defections in backwards induction which conventionally require players of bounded rationality to defect in every stage of the finite iterated prisoner's dilemma. The question now is, does such mutual preemption apply in an unbounded rational analysis where players consider a wider range of possible alternate probability spaces? To this end, we suppose that player X adopts a probability space $\mathcal{P}^X_k$ where they functionally correlate their moves for stage $2 \le n \le (N-k)$ to their opponent's previous choices via

\[
P^X(x_n|H_n) = \delta_{x_n y_{n-1}} \tag{11.62}
\]

for $2 \le n \le N-k$ and for every history $H_n$, but chooses to make their choices in subsequent stages independently so that for $(N-k+1) \le n \le N$, all distributions $P^X(x_n|H_n)$ for all histories $H_n$ represent independent behavioural random variables. Similarly, we suppose that player Y adopts a probability space $\mathcal{P}^Y_j$ where they functionally correlate their moves for stage $2 \le n \le (N-j)$ to their opponent's previous choices via

\[
P^Y(y_n|H_n) = \delta_{y_n x_{n-1}} \tag{11.63}
\]

for $2 \le n \le N-j$ and for every history $H_n$, but chooses to make their choices in subsequent stages independently so that for $(N-j+1) \le n \le N$, all distributions $P^Y(y_n|H_n)$ for all histories $H_n$ represent independent behavioural random variables.
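
To make the construction concrete, the following sketch generates a play sequence in which X copies Y's previous move up to stage $N-k$ and Y copies X's previous move up to stage $N-j$, with the remaining "endgame" moves supplied freely. Two simplifications are assumed and are not part of the text: the free endgame moves are fixed lists rather than history-contingent behavioural strategies, and the stage payoffs are taken as $\Pi^X = 2 + x - 2y$ and $\Pi^Y = 2 - 2x + y$ (0 = cooperate, 1 = defect). All names are illustrative.

```python
def endgame_play(n, k, j, x1, x_tail, y1, y_tail):
    """Play n stages. x_tail holds X's k free moves for stages n-k+1..n and
    y_tail holds Y's j free moves for stages n-j+1..n (0 <= k, j <= n-1)."""
    assert 0 <= k <= n - 1 and 0 <= j <= n - 1
    assert len(x_tail) == k and len(y_tail) == j
    xs, ys = [x1], [y1]
    for stage in range(2, n + 1):
        x_prev, y_prev = xs[-1], ys[-1]
        # Correlated stages copy the opponent; later stages use the free moves.
        x = y_prev if stage <= n - k else x_tail[stage - (n - k) - 1]
        y = x_prev if stage <= n - j else y_tail[stage - (n - j) - 1]
        xs.append(x)
        ys.append(y)
    return xs, ys

def totals(xs, ys):
    """Total payoffs under the assumed stage payoffs."""
    px = sum(2 + x - 2 * y for x, y in zip(xs, ys))
    py = sum(2 - 2 * x + y for x, y in zip(xs, ys))
    return px, py

# X keeps one free final move and uses it to defect; Y stays Markovian.
xs, ys = endgame_play(6, 1, 0, 0, [1], 0, [])
print(xs, ys, totals(xs, ys))  # [0, 0, 0, 0, 0, 1] [0, 0, 0, 0, 0, 0] (13, 10)
```

Setting $k = j = 0$ recovers the fully Markovian play of Section 11.5.3, while larger $k$ or $j$ frees more terminal moves for preemptive defection.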

For either player Z, the probability space $\mathcal{P}^Z_k$ subsumes a number of other possible probability spaces of interest. For instance, setting either $k = N-1$ or $k = N$ makes all of player Z's behavioural variables throughout the entire game independent, so $\mathcal{P}^Z_N = \mathcal{P}^Z_{N-1} = \mathcal{P}^Z_B$. More interestingly, this probability space subsumes certain deterministic alternatives. To see this, suppose that player Z considers a probability space enforcing defection with certainty in the last $k$ stages. However, it is not difficult to see that this probability space is weakly dominated by space $\mathcal{P}^Z_k$: this latter space allows players to either defect whenever that is payoff maximizing, so they will do as well as defecting with certainty, or to cooperate whenever that is payoff maximizing, so they will do as well as cooperating with certainty. That is, the motivation to preemptively defect in the endgame for a larger payoff is taken into account when considering the probability space $\mathcal{P}^Z_k$. Exactly similar considerations establish that $\mathcal{P}^Z_k$ weakly dominates spaces enforcing a deterministic play of Tit-For-Tat which specify cooperation in the first stage.

We now suppose that players X and Y together adopt the joint probability spaces $\mathcal{P}^X_k \times \mathcal{P}^Y_j$ to examine rational choices for the cessation of cooperative play and the onset of preemptive defections. In this particular joint probability space, the optimization problem for each player becomes

X : max (YiL) = 

Y: P''{x,)P^{y,)P''{xM-k+i\H'^.k+i)P''{yN-,+i\H'N-,+i) X . . . 

^l'^]V-fe+l'---'^]V=0 

yi,yN-j+i,--yN=^ 

... X P-^ {xn\H'j^)P^ {yN\H'j^)Il^j{xi,XN^k+i, • • • , Xn, yi, yN-j+i, • • • , 1/Af) 

Y : max {UL) = (11.64) 

?/i.?/iv-j+i.---?/iv=0 

... X P^ {xn\H'j^)P^ {yN\H'j^)I[lj{xi, XN_k+i, ■■■,xn, yi, yN-j+i, ■■■Vn), 

where again, care must be taken in writing the delta-function modified history sets $H'_n$. In this equation, the attained payoffs for any given play sequence $(x_1, x_{N-k+1}, \dots, x_N, y_1, y_{N-j+1}, \dots, y_N)$, assuming for simplicity that $N > 3$, are variously:



1 < k < {N — l),j = : independent variables: Xi,xiy_k+i, ■ ■ ■ ,XN,yi (11.65) 

2N + ^xi + ^^1/1 - En=N-k+i Xn + XM, {N - k) eveu 

2N + ^4^X1 + ^^yi - El"^-fe+l Xn + XN, {N - k) odd. 
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2N + ^xi + ^±1^1/1 - En=N-k+i ^n - 2xN, {N - k) even 



n 



2 -■- ' 2 

2/Y _|_ A;-1-Af^ I S+fc-TV 



2 Xi + ^i^^l/l - T.n=N-k+l Xn " 2xn, {N - k) odd. 



'^ ^ k < {N ~ l),j = {N — 1) : independent variables: xi,XN-k+i^ ■ ■ ■ , xn, Hi, ■ ■ ■ ,yN 



nj = 2N + XI 



N 



N 



N-k-1 



n=N-k+l 



n=l 



n=N-k 



n 



2N - 2x1 - 2 



N 

E 

n=7V-fc+l 



Xr. 



N-k-1 

E 



N 



yn+ E Vr^ 



n=l n=N—k 



k = j,l < k < {N — 1) : independent variables: xi,XN-k+i, ■ ■ ■ ,XN,yi, VN-k+i, ■ ■ ■ ,yN 



2N + ^ Xi + 2^/1+ ^n=N-k+l ^n 2 2ln=N-k+l V-n 



{N — k) even 



n 



2N + 



3+k-N ^ I k-3-N, 



2 -Xl + '^^Y^yi + T.n=N-k+l a^n - 2 En=7V-fc+l Vn, {N - /c) odd 



n 



2N + ^xi + ^yi - 2 E^=7v-fc+i xn + En=N-k+i Vn, {N - k) even 

2N + ^^Xi + ^±^1/1 - 2 Y.n=N-k+l ^n + Et;v-fc+l 1/n, {N - k) odd. 



/c > j, 1 < /c, j < (A^ — 1) : independent variables: Xi, XAr„A:+i, . . . , Xn, Hi, Vn-j+i, ■ ■ ■ ,yN 






2N + 



2N 



^xi 



fe-l-Af 



-7/1 - V^-^-1 T + V^ 



:Ar_j Xn 2 2^n=N-j+l Vn-, 

{N — k) even 



2 yi l^n=N-k+l -^n T l^n=N-j -^n ^ l^n=N-j+l ilni 

{N - k) odd 




k-N ^ I 2+k-N„, Y^Af-j-1 ^ O Y^JV ^ i Y^A^ 



K 



OAT I fe-^V, I ^+fe-^'^ 7, v"-J--^ T, -2V^^ r +V^^ ?/ 

ZiV -t- 2 "''I T^ 2 yi 2^n=N-k+l-^n '^ 2^n=N-j -^n ^ 2^n=N-j+l yn, 

{N — k) even 

OAT I fc-l-AT I 3+k-N ^N-j-1 _9V^ T +V^ 7/ 

{N - k) odd. 



The respective constrained equilibria with the optimized payoffs are shown in Table 11.2 for all combinations of $k$ and $j$. Every listed payoff pair in Table 11.2 is an isomorphically constrained equilibrium point optimizing payoffs given imposed constraints. As noted previously, there is no generally accepted method to choose between alternate equilibria. However, it is tempting to use the rules of game theory to try to select an optimal choice of play. In Table 11.2 each alternate probability space becomes a strategy choice, and each equilibrium point becomes a pair of payoffs. Standard techniques can then be applied to determine global equilibria among the located constrained equilibria. However, we note that in general we have to take care to deal with multiple equilibria generated by particular joint probability spaces. By applying the Nash equilibrium definition to Table 11.2, we obtain global equilibria at $\mathcal{P}^X_k \times \mathcal{P}^Y_j$ for either $k = 0$ and $3 \le j \le (N-2)$, or $j = 0$ and $3 \le k \le (N-2)$.

These global equilibria can be considered rational for the iterated prisoner's dilemma in this restricted class of joint probability spaces, and there is no established way to select a particular one among these. The more important feature emerging from this analysis is that cooperation still naturally arises from these equilibria. The pathways produced by these equilibria are dominated by cooperation apart from some different choices at the last stage. This cooperative behaviour results when players of unbounded rationality examine alternative probability spaces to optimize their payoffs, in contrast to the conventionally mandated analysis wherein players are able to examine only a single probability space and are thus of bounded rationality.
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Table 11.2: A partial listing of isomorphic equilibria when players X and Y jointly adopt 



the probability space $\mathcal{P}^X_k \times \mathcal{P}^Y_j$. In this space, X functionally correlates their moves for stage $2 \le n \le (N-k)$ to their opponent's previous choices but adopts independent behavioural strategies in stages $(N-k+1)$ to $N$, while player Y functionally correlates their moves for stage $2 \le n \le (N-j)$ to their opponent's previous choices but adopts independent behavioural strategies in stages $(N-j+1)$ to $N$. Here, every shown payoff pair is an isomorphic equilibrium point, making selection of a single best payoff maximization strategy difficult. Fractions indicate alternate equilibria with distinct payoffs shown
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Conclusion 



12.1 The foundations of strategic analysis 

Strategic game analysis begins by defining the set of players

\[
I = \{1, 2, \dots, n\} \tag{12.1}
\]

with $n \ge 2$. The choice $n = 1$ corresponds to decision theory. This immediately raises the question as to whether $n$ is fixed or variable, and what effect this might have on the structure of the game analysis space. The number of players $n$ acts as an index denoting the size of all subsequent spaces, and $n$ would normally be considered as a constant taking different values. Suppose however, that a player wanted to construct a single space which "contained" all the possible spaces defined by each value of $n$. Would this single space adopt isomorphic mappings or allow uncertainty in the number of players to influence strategic decisions?

Subsequently, each player $i$ has a set of pure strategies $S_i = \{1, 2, \dots, m_i\}$ which combine together to give a set of pure strategy profiles $S = S_1 \times \dots \times S_n$. It is commonly assumed that an unconstrained rational player must consider every one of their moves with some (possibly infinitesimal) probability and thus that the structure of the strategy set specifies the structure of the game. In contrast, we have shown that different probability spaces can be applied to the set of all possible strategies. Hence, it is a mistake to assume that the dimensionality of the strategy set somehow determines the dimensionality of the game space.

A payoff function $\Pi : S \to \mathbb{R}^n$ with $\Pi(s) = [\Pi_1(s), \dots, \Pi_n(s)]$ then defines the payoff that player $i$ receives when strategy profile $s \in S$ is played. Subsequently, a player $i$'s mixed strategy is defined as a probability distribution over the pure strategy set $S_i$ to locate a point in an $(m_i - 1)$-dimensional standard simplex

\[
\Delta_i = \left\{ x_i \in \mathbb{R}^{m_i} : \forall j = 1 \dots m_i : x_{ij} \ge 0 ; \sum_{j=1}^{m_i} x_{ij} = 1 \right\}. \tag{12.2}
\]

The mixed strategy profile is then a vector $x = \{x_1, \dots, x_n\}$. The mixed strategy space is a multi-simplex $\Delta = \Delta_1 \times \dots \times \Delta_n$. This simplex is held to be "complete" and to contain every possible probability distribution that might describe a game. It certainly contains every possible value of every possible probability distribution, but optimization requires it to contain every possible value and gradient of each probability distribution at a minimum. (Situations requiring greater generality could well be envisaged.)

Finally, following Von Neumann and Morgenstern, it is universally held that every player's randomizations are independent and hence that there are no constraints acting on the probability distributions of the mixed strategy space. Thus, the probability of a pure strategy profile $s$ given $x$ is

\[
x(s) = \prod_{i=1}^{n} x_{i s_i} \tag{12.3}
\]

and the expected payoff to player $i$ is

\[
u_i(x) = \sum_{s \in S} x(s)\, \Pi_i(s). \tag{12.4}
\]

This payoff definition acts to limit the scope of possible games considered in game theory. 
There is no reason why games have to be restricted to consider only poly-linear expected 
payoff functions, and we argue here that these restrictions have limited the ability to 
analyze games. Payoffs can be assigned to players based on the probability distributions 
that they adopt, or on the gradients of the adopted probability distributions, or on their 
ability to maximize entropy or uncertainty or mutual information or Fisher information. 
Game probability distributions can be actualized by having players adjust the probability 
of light transmission through painted glass, or by altering the placement and number of 
pins affecting the fall of balls or of water streams. More mundanely, players can instruct
agents allowing referees to repeat games many times to deduce adopted probability dis- 
tributions to assign payoffs. Further, in the absence of a complete theory of games, we 
simply do not know if players of unbounded rationality would optimize their outcomes 
by calculating the Fisher Information of a game, or by maximizing the Log Likelihood 
function. No limits should be placed on rationality in formulating a complete theory of 
games. 
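
For reference, the conventional poly-linear expected payoff criticized above (the product rule of Eq. 12.3 feeding the sum of Eq. 12.4) can be written out directly. The sketch below is generic in the number of players; the example payoff function `pd`, a single stage prisoner's dilemma with payoffs $2 + a - 2b$ and $2 - 2a + b$, is an assumed illustration and is not taken from this chapter.

```python
from itertools import product
from math import prod

def profile_probability(x, s):
    """Eq. 12.3: probability of pure profile s when every player
    randomizes independently according to the mixed profile x."""
    return prod(x[i][s_i] for i, s_i in enumerate(s))

def expected_payoff(x, payoff, i):
    """Eq. 12.4: player i's expected payoff, summed over all pure profiles."""
    index_sets = (range(len(x_i)) for x_i in x)
    return sum(profile_probability(x, s) * payoff(s)[i]
               for s in product(*index_sets))

def pd(s):
    """Assumed example game: 0 = cooperate, 1 = defect."""
    a, b = s
    return (2 + a - 2 * b, 2 - 2 * a + b)

x = [[0.5, 0.5], [0.25, 0.75]]  # mixed strategies of players 0 and 1
print(expected_payoff(x, pd, 0))  # 1.0
print(expected_payoff(x, pd, 1))  # 1.75
```

Because every profile probability is a product of one parameter per player, the result is linear in each player's own mixed strategy, which is the restriction the text argues should not be built into a complete theory.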

Present practice in game theory discards isomorphism constraints, allowing the mixed strategy space to take the form of a compact convex polyhedron in which expected payoff functions are quasiconcave and continuous polylinear functions of the mixed strategies of each player. This, in turn, allows the use of fixed point theorems to locate Nash equilibria, points at which no player can unilaterally improve their expected payoffs by changing their mixed strategy [2, 3]. However, no rationale has ever been offered for why the tangent spaces of the embedded source probability distributions need to be overwritten. That is, the strength of the isomorphisms underlying the construction of mixed strategy spaces has never been considered. Whenever analysis is transferred from one space to another, the strength of the isomorphism underlying the transfer mapping must be established. Von Neumann did precisely this when he provided the mathematical foundations of quantum mechanics. In its early stages, quantum mechanics appeared in two seemingly distinct forms, matrix mechanics and wave mechanics. Von Neumann unified these approaches by establishing an exact isomorphism between the space of states in matrix mechanics and the space of wave functions, including all relevant derivatives, using theorems from functional analysis [124]. From that point on, the proven existence of this isomorphic mapping allowed quantum analysis to use either matrix or wave approaches as desired. In game theory, the strength of the isomorphic mapping underlying the embedding of probability spaces within mixed strategy spaces has not yet been established.

If, following probability theory, the original tangent spaces of the source probability 
distributions describing a game are retained within the mixed strategy space, then this 
impacts on the boundaries, shape, dimensionality, and geometry of the mixed strategy 
space. In turn, this alters the strategic analysis. For example, different tangent spaces 
can change the convexity and polylinearity properties of expected payoff functions — 
one tangent space might ensure expected payoff functions are convex and polylinear so 
established existence theorems can define Nash equilibria, while a different tangent space 
might support nonconvex and non-polylinear expected payoff functions. In such spaces 
established existence theorems cannot be used to define Nash equilibria. 

Probability theory models two perfectly correlated variables as necessarily possessing perfectly correlated trembles, and accomplishes this by using a one-dimensional tangent space. In contrast, in the mixed strategy space two perfectly correlated variables can exhibit independent trembles because the mixed strategy tangent space permits this. Similarly, probability theory models independent variables as necessarily possessing independent trembles in a two-dimensional tangent space. In contrast, independent variables in the mixed strategy space must exhibit correlated trembles if they are to remain independent in the enlarged tangent space of the mixed strategy space. (They must fluctuate together to maintain the separability of their joint distribution.) The different tangent spaces adopted by probability theory and game theory impact on which probabilities can be trembled and on the possibility of equilibrium refinements. As trembles are the differential variations of probability parameters within the adopted tangent space, different tangent spaces modify both possible trembles and defined gradient operators. Altering the differential fluctuations and gradients of a probability space corresponds to altering which moves can occur at each stage of a game and even the number of stages in a game. In turn, these altered move trees impact on the implementation of optimization algorithms such as "backwards induction". In general, the adopted tangent space underlies all optimization algorithms in both game and probability theory. Game theory imposes the tangent space of the mixed strategy simplex on all the probability distributions modelling a game, while probability theory associates different tangent spaces with each probability distribution. It is natural to expect that these different adopted tangent spaces will lead to different optimization outcomes.
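
A minimal numerical illustration of this point is sketched below. It compares the slope of X's expected payoff when X's defection probability varies alone (independent variables, two free parameters) with the slope when both players' moves are perfectly correlated and share a single parameter (a one-dimensional tangent space). The stage payoff $2 + x - 2y$ for X (0 = cooperate, 1 = defect) is an assumption made for the example, and the function names are illustrative.

```python
def expected_x_payoff(joint):
    """Expected payoff to X; joint[(x, y)] is the probability of that pair."""
    return sum(p * (2 + x - 2 * y) for (x, y), p in joint.items())

def independent(p, q):
    """X defects with probability p, Y with probability q, independently."""
    return {(x, y): (p if x else 1 - p) * (q if y else 1 - q)
            for x in (0, 1) for y in (0, 1)}

def perfectly_correlated(p):
    """Both players defect together with probability p, else cooperate."""
    return {(0, 0): 1 - p, (0, 1): 0.0, (1, 0): 0.0, (1, 1): p}

def slope(f, t, h=1e-6):
    """Central finite-difference estimate of df/dt."""
    return (f(t + h) - f(t - h)) / (2 * h)

p = q = 0.3
print(slope(lambda t: expected_x_payoff(independent(t, q)), p))        # about +1
print(slope(lambda t: expected_x_payoff(perfectly_correlated(t)), p))  # about -1
```

Under these assumptions the sign of the gradient flips: the independent tremble favours defection while the correlated tremble favours cooperation, which is the kind of difference in optimization outcome the paragraph describes.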

In this work, we have shown that we can define and employ probability distributions 
possessing properties which differ from any "contained" within the mixed strategy sim- 
plex. These probability distributions possess a different differential geometry to that of 
the simplex. This has not generally been considered as probability spaces are not sup- 
posed to possess a geometrical interpretation. However, optimizing random functions 
within probability spaces often takes advantage of the geometrical properties of those 
spaces, and when those spaces are isomorphically embedded within enlarged probability 
spaces, then those geometrical properties must be preserved. 

We further note that mixed strategy spaces are supposed to contain all cases of deter- 
ministic dependencies. Every deterministic dependency equates to every possible func- 
tional dependency, and there are standard techniques for dealing with these functional 
dependencies. Players can embed their decision making processes within determinis- 
tic functional spaces of arbitrary dimension and scope. The resulting analysis must be 
consistent with multi-variate calculus and differential geometry. Should probability dis- 
tributions be applied to these analytical structures, then the analysis should be consistent 
with probability theory. 

There are essentially no limits to the scope of the analysis that can be brought to bear 
by a rational optimizing agent in a game. And game theory needs to provide a treatment 
consistent with these other approaches. If a player, following the rules of game theory, 
cannot accurately calculate properties of a game, then they have bounded rationality. 
In order to properly calculate game properties, players must use isomorphic probability 
spaces. Isomorphic mappings are necessary in order to exhibit unbounded rationality. 

In this paper, we hold that game theory must be fully consistent with both probability 
theory and optimization theory in general. Further, we hold that rational players must 
be able to reproduce any result from probability theory or optimization theory when 
analyzing a game or a decision tree. Indeed, a rational player should, if they chose, be 
able to exclusively use techniques from probability theory and find perfect accord with 
the results of game theory. Probability theory mandates that appropriate constraints 
designed to preserve tangent spaces must be used whenever probability distributions are 
embedded within an enlarged space in order to preserve all properties. Game theory 
has eschewed use of any constraints when embedding distributions within the mixed 
strategy probability space, and this leads to contradictions with probability theory. These 
discrepancies stem from the different tangent spaces adopted by probability theory and 
game theory, and an examination of these issues promises to cast light on some of the 
paradoxes of game theory. At the very least, these issues require examination even if 
only to establish their irrelevance. 
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In this work, we consider how to locate the best possible optima from many differ- 
ent functions defined over different incommensurate spaces. One way to approach this 
problem is to sequentially select each space, and then each function within that space, 
and then to locate each of the optima of that function, and finally to compare all op- 
tima to locate the best outcome. An alternative approach is to embed every possible 
function from each space into a single enlarged function, and then to apply standard 
techniques to locate the optima of that function. This approach is in common use in 
decision theory, game theory, and in artificial intelligence where multistage search and 
decision problems are concatenated together into a single, enlarged, multivariate map- 
ping from choices to outcomes. However, the typical embeddings used in these fields do 
not preserve gradient information specific to the source function. That is, an embedding
of a source function $f(x)$ within a surface $g(x, y)$ can be via either $\lim_{y \to y_0} g(x, y) = f(x)$
or $g(x, y)|_{y = y_0} = f(x)$. The first of these methods does not necessarily preserve gradient
information, as $\lim_{y \to y_0} \nabla g(x, y) \neq \nabla g(x, y)|_{y = y_0} = \nabla f(x)$. In other words, the surface
gradient generally does not replicate the line gradient of the function embedded within 
it. This means that a single surface containing many embedded functions can't repro- 
duce gradient information and hence can't be used to locate optima of those embedded 
functions. 
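
The gap between a limit embedding and its gradient can be checked numerically. The
following is a minimal sketch, assuming a hypothetical source function f(x) = x^2 and a
hypothetical surface g(x, y) = f(x) + y sin(x/y); neither appears in the text, and both are
chosen only because the surface converges to f(x) as y tends to 0 while its gradient
along x does not converge to the gradient of f(x).

```python
import math


def f(x):
    # Hypothetical source function, chosen only for illustration.
    return x * x


def g(x, y):
    # Hypothetical surface. Since |y * sin(x / y)| <= |y|,
    # g(x, y) tends to f(x) as y tends to 0.
    return f(x) + y * math.sin(x / y)


def dg_dx(x, y):
    # Exact partial derivative of g with respect to x.
    return 2.0 * x + math.cos(x / y)


def max_value_gap(y, xs):
    # Largest distance between the surface and the source function.
    return max(abs(g(x, y) - f(x)) for x in xs)


def max_gradient_gap(y, xs):
    # Largest distance between the surface gradient along x and f'(x) = 2x.
    return max(abs(dg_dx(x, y) - 2.0 * x) for x in xs)


xs = [i / 100.0 for i in range(1, 201)]
for y in (1e-1, 1e-2, 1e-3, 1e-4):
    print(y, max_value_gap(y, xs), max_gradient_gap(y, xs))
```

In this sketch the value gap is bounded by y and so shrinks with it, while the gradient
gap is the magnitude of cos(x/y), which keeps oscillating and does not shrink however
small y becomes. An optimizer that follows the surface gradient would therefore not be
following the gradient of the embedded function.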






Bibliography 



[1] J. von Neumann and O. Morgenstern. Theory of Games and Economic Behavior. 
Princeton University Press, Princeton, 1944. 

[2] J. F. Nash. Equilibrium points in n-person games. Proceedings of the National
Academy of Sciences of the United States of America, 36(1):48-49, 1950.

[3] J. Nash. Non-cooperative games. Annals of Mathematics, 54(2):286-295, 1951. 

[4] H. W. Kuhn. Extensive games and the problem of information. In H. W. Kuhn and
A. W. Tucker, editors, Contributions to the Theory of Games, Volume II, Princeton
Annals of Mathematical Studies, No. 28, Princeton, 1953. Princeton University
Press.

[5] S. Hart. Games in extensive and strategic forms. In R. J. Aumann and S. Hart,
editors, Handbook of Game Theory with Economic Applications, pages 19-40,
Amsterdam, 1992. North Holland.

[6] R. Selten. A reexamination of the perfectness concept for equilibrium points in
extensive games. International Journal of Game Theory, 4:25-55, 1975.

[7] D. Chatterjee. Abstract Algebra. Prentice-Hall, New Delhi, 2005.

[8] K. Ito. Introduction to Probability Theory. Cambridge University Press, Cambridge,
1984.

[9] R. M. Gray. Probability, Random Processes and Ergodic Processes. Springer,
Dordrecht, 2009.

[10] P. Walters. An Introduction to Ergodic Theory. Springer-Verlag, New York, 1982.

[11] E. Sernesi. Linear Algebra: A Geometric Approach. Chapman and Hall, Boca
Raton, 1993.

[12] H.-O. Georgii. Stochastics: Introduction to Probability and Statistics. de Gruyter,
Berlin, 2008.

[13] F. Burk. Lebesgue Measure and Integration: An Introduction. Wiley, New York,
1998.



[14] M. Hayashi. Quantum Information: An Introduction. Springer, Berlin, 2006.

[15] P. E. Pfeiffer. Probability for Application. Springer, New York, 1990.

[16] A. Gersho and R. M. Gray. Vector Quantization and Signal Compression. Springer,
London, 1991.

[17] M. Insall, T. Rowland, and E. W. Weisstein. Embedding. From MathWorld--A
Wolfram Web Resource. http://mathworld.wolfram.com/Embedding.html, 2009.

[18] F. P. Ramsey. A mathematical theory of savings. Economic Journal, 38(152):543-559,
1928.

[19] D. G. Kelly. Introduction to Probability. Macmillan, New York, 1994.

[20] T. M. Cover and J. A. Thomas. Elements of Information Theory. Wiley, New York,
1991.

[21] E. van Damme. Strategic equilibrium. In R. J. Aumann and S. Hart, editors,
Handbook of Game Theory with Economic Applications, pages 1521-1596, Amsterdam,
1992. North Holland.

[22] J. Pinter. Global Optimization. From MathWorld--A Wolfram Web Resource,
created by Eric W. Weisstein. http://mathworld.wolfram.com/GlobalOptimization.html

[23] R. J. Aumann. Subjectivity and correlation in randomized strategies. Journal of
Mathematical Economics, 1:67-96, 1974.

[24] P. Milgrom and J. Roberts. Predation, reputation, and entry deterrence. Journal
of Economic Theory, 27:280-312, 1982.

[25] R. Selten. The chain store paradox. Theory and Decision, 9:127-159, 1978.

[26] R. W. Rosenthal. Games of perfect information, predatory pricing and the
chain-store paradox. Journal of Economic Theory, 25:92-100, 1981.

[27] D. M. Kreps and R. Wilson. Reputation and imperfect information. Journal of
Economic Theory, 27:253-279, 1982.

[28] L. H. Davis. No chain store paradox. Theory and Decision, 18(2):139-144, 1985.

[29] W. Trockel. The chain-store paradox revisited. Theory and Decision, 21(2):163-179,
1986.



[30] R. Wilson. Strategic models of entry deterrence. In R. J. Aumann and S. Hart, 
editors, Handbook of Game Theory with Economic Applications, pages 305-329, 
Amsterdam, 1992. North Holland. 

[31] D. M. Kreps. Corporate culture and economic theory. In J. E. Alt and K. A.
Shepsle, editors, Perspectives on Positive Political Economy, page 1, Cambridge,
UK, 1990. Cambridge University.

[32] J. Berg. Trust, reciprocity, and social history. Games and Economic Behavior,
10:122-142, 1995.

[33] B. King-Casas, D. Tomlin, C. Anen, C. F. Camerer, S. R. Quartz, and P. R. 
Montague. Getting to know you: Reputation and trust in a two-person economic 
exchange. Science, 308:78-83, 2005. 

[34] W. Güth, R. Schmittberger, and B. Schwarze. An experimental analysis of ultimatum
bargaining. Journal of Economic Behavior and Organization, 3(4):367-388,
1982.

[35] I. Stahl. Bargaining Theory. Economic Research Institute, Stockholm, 1972. 

[36] A. Rubinstein. Perfect equilibrium in a bargaining model. Econometrica, 50:97-109, 
1982. 

[37] K. Binmore, A. Shaked, and J. Sutton. Testing noncooperative bargaining theory: 
A preliminary study. American Economic Review, 75:1178-1180, 1985. 

[38] J. Ochs and A. E. Roth. An experimental study of sequential bargaining. American 
Economic Review, 79(3):355-384, 1989. 

[39] G. E. Bolton. A comparative model of bargaining: Theory and evidence. American 
Economic Review, 81(5):1096-1136, 1991. 

[40] H. Oosterbeek, R. Sloof, and G. van de Kuilen. Cultural differences in ultima- 
tum game experiments: Evidence from a meta-analysis. Experimental Economics, 
7:171-188, 2004. 

[41] E. Hoffman, K. A. McCabe, and V. L. Smith. On expectations and the monetary 
stakes in ultimatum games. International Journal of Game Theory, 25(3):289-302, 
1996. 

[42] R. Slonim and A. E. Roth. Learning in high stakes ultimatum games: An experi- 
ment in the Slovak republic. Econometrica, 66(3):569-596, 1998. 



[43] L. A. Cameron. Raising the stakes in the ultimatum game: Experimental evidence
from Indonesia. Economic Inquiry, 37(1):47-59, 1999.

[44] A. E. Roth, V. Prasnikar, M. Okuno-Fujiwara, and S. Zamir. Bargaining and mar- 
ket behavior in Jerusalem, Ljubljana, Pittsburgh, and Tokyo. American Economic 
Review, 81(5):1068-1095, 1991. 

[45] J. Henrich. Does culture matter in economic behavior? Ultimatum game bargaining 
among the Machiguenga of the Peruvian Amazon. American Economic Review, 
90(4):973-979, 2000. 

[46] R. H. Thaler. The ultimatum game. Journal of Economic Perspectives, 2(4):195- 
206, 1988. 

[47] A. E. Roth. Bargaining experiments. In J. Kagel and A. E. Roth, editors, Handbook
of Experimental Economics, Princeton, NJ, 1995. Princeton University Press.

[48] C. Camerer and R. H. Thaler. Anomalies - ultimatums, dictators and manners. 
Journal of Economic Perspectives, 9(2):209-219, 1995. 

[49] S. Zamir. Rationality and emotions in ultimatum bargaining, mimeo. Lecture, 
Conference Des Annales, June 19, 2000. 

[50] E. Winter and S. Zamir. An experiment with ultimatum bargaining in a chang- 
ing environment. The Hebrew University, Center for Rationality and Interactive 
Decision Theory, DP No. 159, 1997. 

[51] B. J. Ruffle. More is better, but fair is fair: Tipping in dictator and ultimatum 
games. Games and Economic Behavior, 23(2):247-265, 1998. 

[52] V. Prasnikar and A. E. Roth. Considerations of fairness and strategy: Experimental 
data from sequential games. Quarterly Journal of Economics, 107(3):865-888, 1992. 

[53] M. Rabin. Incorporating fairness into game theory and economics. American 
Economic Review, 83(5):1281-1302, 1993. 

[54] G. Loewenstein, I. Samuel, C. Camerer, and L. Babcock. Self-serving assessments 
of fairness and pretrial bargaining. Journal of Legal Studies, 22:135-159, 1993. 

[55] R. Forsythe, J. L. Horowitz, N. E. Savin, and M. Sefton. Fairness in simple bar- 
gaining experiments. Games and Economic Behavior, 6(3):347-369, 1994. 

[56] S. Blount. When social outcomes aren't fair: The effect of causal attributions on
preferences. Organizational Behavior and Human Decision Processes, 63(2):131-144,
1995.



[57] G. E. Bolton and A. Ockenfels. ERC: A theory of equity, reciprocity and competition.
American Economic Review, 90(1):166-193, 2000.

[58] J. H. Kagel, G. Kim, and D. Moser. Fairness in ultimatum games with asymmetric 
information and asymmetric payoffs. Games and Economic Behavior, 13(1):100- 
110, 1996. 

[59] S. J. Burnell, L. Evans, and S. Yao. The ultimatum game: Optimal strategies 
without fairness. Games and Economic Behavior, 26:221-252, 1999. 

[60] M. Dufwenberg and G. Kirchsteiger. A theory of sequential reciprocity. Mimeo,
CentER for Economic Research, Tilburg, 1998.

[61] A. Falk and U. Fischbacher. A theory of reciprocity. Institute for Empirical
Research in Economics: Working Paper Series, Working Paper No. 6. See
http://www.unizh.ch/iew/wp/, 2000.

[62] G. Kirchsteiger. The role of envy in ultimatum games. Journal of Economic 
Behavior and Organization, 25(3):373-390, 1994. 

[63] G. E. Bolton and R. Zwick. Anonymity versus punishment in ultimatum bargaining. 
Games and Economic Behavior, 10:95-121, 1995. 

[64] E. Fehr and K. M. Schmidt. A theory of fairness, competition and cooperation. 
Quarterly Journal of Economics, 114:817-868, 1999. 

[65] D. Levine. Modeling altruism and spitefulness in experiments. Review of Economic 
Dynamics, 1:593-622, 1998. 

[66] E. Hoffman, K. McCabe, K. Shachat, and V. Smith. Preferences, property rights
and anonymity in bargaining games. Games and Economic Behavior, 7:346-380,
1994.

[67] A. E. Roth and I. Erev. Learning in extensive form games: Experimental data and 
simple dynamic models in the intermediate term. Games and Economic Behavior, 
8(1):164-212, 1995. 

[68] J. Gale, K. G. Binmore, and L. Samuelson. Learning to be imperfect: The ultimatum
game. Games and Economic Behavior, 8(1):56-90, 1995.

[69] N. J. Vriend. Will reasoning improve learning? Economics Letters, 55(1):9-18, 
1997. 

[70] J. Duffy and N. Feltovich. Does observation of others affect learning in strategic 
environments? An experimental study. International Journal of Game Theory, 
28(1):131-152, 1999. 



[71] M. A. Nowak, K. M. Page, and K. Sigmund. Fairness versus reason in the ultimatum 
game. Science, 289:1773-1775, 2000. 

[72] S. Huck and J. Oechssler. The indirect evolutionary approach to explaining fair 
allocations. Games and Economic Behavior, 28:13-24, 1999. 

[73] W. Güth and M. Yaari. An evolutionary approach to explain reciprocal behavior
in a simple strategic game. In U. Witt, editor, Explaining Process and Change:
Approaches to Evolutionary Economics, pages 23-34, Ann Arbor, 1992.

[74] R. Peters. Evolutionary stability in the ultimatum game. Group Decision and 
Negotiation, 9(4):315-324, 2000. 

[75] G. Hardin. The tragedy of the commons. Science, 162:1243-1248, 1968. 

[76] E. Fehr and S. Gächter. Cooperation and punishment in public goods experiments.
American Economic Review, 90:980-994, 2000.

[77] M. A. Nowak and K. Sigmund. Evolution of indirect reciprocity by image scoring. 
Nature, 393:573-577, 1998. 

[78] C. Wedekind and M. Milinski. Cooperation through image scoring in humans. 
Science, 288:850-852, 2000. 

[79] E. Fehr and U. Fischbacher. Social norms and human cooperation. Trends in
Cognitive Sciences, 8(4):185-190, 2004.

[80] M. A. Nowak and K. Sigmund. Evolution of indirect reciprocity. Nature, 437:1291- 
1298, 2005. 

[81] R. Beausoleil, K.-Y. Chen, and T. Hogg. A practical quantum mechanism for the
public goods game. Eprint Archive: quant-ph/0301013 (see
http://arxiv.org/abs/quant-ph/0301013), 2003.

[82] D. M. Kreps. A Course in Microeconomic Theory. Harvester Wheatsheaf, New
York, 1990.

[83] R. McKelvey and T. Palfrey. An experimental study of the centipede game. Econo- 
metrica, 60(4):803-836, 1992. 

[84] R. Nagel and F. F. Tang. An experimental study on the centipede game in normal 
form: An investigation on learning. Journal of Mathematical Psychology, 42:356- 
384, 1998. 

[85] K. Binmore. Modeling rational players: Part I. Economics and Philosophy, 3:179- 
214, 1987. 



[86] K. Binmore. Modeling rational players: Part II. Economics and Philosophy, 4:9-55,
1988.

[87] K. Binmore. Rationality in the centipede. In R. Fagin, editor, Theoretical Aspects
Of Rationality And Knowledge (TARK 1994): Proceedings of the 5th Conference on
Theoretical Aspects of Reasoning about Knowledge, pages 150-159, Pacific Grove,
California, 1994. Morgan Kaufmann.

[88] R. J. Aumann. Backward induction and common knowledge of rationality. Games
and Economic Behavior, 8:6-19, 1995.

[89] K. Binmore. A note on backward induction. Games and Economic Behavior,
17:135-137, 1996.

[90] R. J. Aumann. Reply to Binmore. Games and Economic Behavior, 17:138-146,
1996.

[91] R. J. Aumann. Note on the centipede game. Games and Economic Behavior,
23:97-105, 1998.

[92] P. Pettit and R. Sugden. The backward induction paradox. The Journal of
Philosophy, 136(4):169-182, 1999.

[93] J. Broome and W. Rabinowicz. Backwards induction in the centipede game.
Analysis, 59(4):237-242, 1999.

[94] J. H. Sobel. Backward-induction arguments: A paradox regained. Philosophy of
Science, 60(1):114-133, 1993.

[95] K. Sigmund and M. A. Nowak. A tale of two selves. Science, 290:949-950, 2000.

[96] R. Cooper, D. V. De Jong, R. Forsythe, and T. W. Ross. Cooperation without
reputation: Experimental evidence from prisoner's dilemma games. Games and
Economic Behavior, 12(2):187-218, 1996.

[97] M. Milinski and C. Wedekind. Working memory constrains human cooperation in
the prisoner's dilemma. Proceedings of the National Academy of Sciences of the
United States of America, 95(23):13755-13758, 1998.

[98] D. D. Davis and C. A. Holt. Equilibrium cooperation in two-stage games:
Experimental evidence. International Journal of Game Theory, 28(1):89-109, 1999.

[99] R. T. A. Croson. Thinking like a game theorist: Factors affecting the frequency of
equilibrium play. Journal of Economic Behavior and Organization, 41(3):299-314,
2000.



[100] R. Radner. Collusive behaviour in non-cooperative epsilon-equilibria in oligopolies 
with long but finite lives. Journal of Economic Theory, 22:136-154, 1980. 

[101] R. Radner. Can bounded rationality resolve the prisoner's dilemma. In A. Mas- 
Colell and W. Hildenbrand, editors, Essays in Honor of Gerard Debreu, pages 
387-399, Amsterdam, 1986. North-Holland. 

[102] F. Vegaredondo. Bayesian boundedly rational agents play the finitely repeated 
prisoner's dilemma. Theory and Decision, 36(2):187-206, 1994. 

[103] S. W. Harborne Jr. Common belief of rationality in the finitely repeated prisoners' 
dilemma. Games and Economic Behavior, 19(1):133-143, 1997. 

[104] N. Anthonisen. Strong rationalizability for two-player noncooperative games. Eco- 
nomic Theory, 13:143-169, 1999. 

[105] E. Fehr and U. Fischbacher. The nature of human altruism. Nature, 425:785-791, 
2003. 

[106] R. Axelrod. The Evolution of Cooperation. Basic Books, New York, 1984.

[107] J. C. Harsanyi. Games with incomplete information played by "Bayesian" players.
Management Science, 14(3):159-182, 1967.

[108] D. M. Kreps, P. Milgrom, J. Roberts, and R. Wilson. Rational cooperation in 
the finitely repeated prisoner's dilemma. Journal of Economic Theory, 27:245-252, 
1982. 

[109] D. Fudenberg and E. Maskin. The Folk Theorem in repeated games with discount- 
ing and incomplete information. Econometrica, 54:533-554, 1986. 

[110] R. Sarin. Simple play in the prisoner's dilemma. Journal of Economic Behavior
and Organization, 40(1):105-113, 1999.

[111] A. Neyman. Cooperation in repeated games when the number of stages is not
known. Econometrica, 67(1):45-64, 1999.

[112] A. Neyman. Bounded complexity justifies cooperation in the finitely repeated pris- 
oner's dilemma. Economics Letters, 19:227-229, 1985. 

[113] A. Rubinstein. Finite automata play the repeated prisoner's dilemma. Journal of 
Economic Theory, 39:83-96, 1986. 

[114] I.-K. Cho and H. Li. How complex are networks playing repeated games. Economic 
Theory, 13:93-123, 1999. 



[115] H. Raff and D. Schmidt. Cumbersome coordination in repeated games. Interna- 
tional Journal of Game Theory, 29(1):101-118, 2000. 

[116] R. Evans and J. P. Thomas. Reputation and experimentation in repeated games 
with two long-run players. Econometrica, 65(5):1153-1173, 1997. 

[117] C. L. Sheng. A note on the prisoner dilemma. Theory and Decision, 36(3):233-246, 
1994. 

[118] E. Groes, H. J. Jacobsen, and B. Sloth. Adaptive learning in extensive form games 
and sequential equilibria. Economic Theory, 13:125-142, 1999. 

[119] Q. A. Song and A. Kandel. A fuzzy approach to strategic games. IEEE Transactions 
on Fuzzy Systems, 7(6):634-642, 1999. 

[120] N. Howard. Paradoxes of Rationality: Theory of Metagames and Political Behavior. 
MIT Press, Cambridge, Mass, 1971. 

[121] A. Rapoport. Escape from paradox. Scientific American, 217:50-56, July 1967. 

[122] P. D. Straffin. Game Theory and Strategy. Mathematical Association of America, 
Washington, 1993. 

[123] J. Eisert, M. Wilkens, and M. Lewenstein. Quantum games and quantum strategies.
Physical Review Letters, 83(15):3077-3080, 1999.

[124] J. von Neumann. Mathematical Foundations of Quantum Mechanics. Princeton 
University Press, Princeton, 1955. First published in 1932. 



